We all know that now-a-days how many transactions are happening per day in this digital world. To store those data in data warehouse we need more storage disk. So everyone is looking for reducing the cost of storing data. One way is to compress the data in the media. By compressing data we will get some benefits like-
- Reduces the storing cost as we can store more data into the disk.
- Reduces the I/O to read those data.
- Memory will hold more data for processing.
Teradata offers some Technique to compress the data like-
- Multi Value Compression(MVC) – Present since V2R5.
- Algorithmic Compression(ALC) – Present since 13.10.
- Block Level Compression(BLC) – Present since 13.10.
In this article we will discuss about MVC technique.
What is MVC?
Multi value compression is a technique to apply compression on column level rather than row or block level. It is mainly developed for the column which has a large number of repeated values. MVC is dictionary based compression. The main idea behind this is that repeated values will be replaced by the bit pattern which will be saved in the table definition of a table. During the data access, this bit pattern as a part of dictionary will come to memory and used to look up the original value.
We can define the values to be replaced with bit pattern in the CREATE TABLE or can use an ALTER table to modify the columns of an existing table to define multi value compression.
Behind the Scene:-
So how this MVC work? Is there any overhead to compress and decompress the data?
As we got to know that MVC is meant for the repetitive values only. So we need to analyze the column values first before applying MVC. Find the below example for better understanding-
From the above example, we can see that the column “Gender” has been defined as CHAR (1). That means it will occupy 1 byte of space. If we analyze, we will find that for the column “Gender” values will either ‘M’ or ‘F’. So in MVC technique, Teradata will replace these values like- 01 for ‘M’ and 11 for ‘F’ that means only 2 bits instead of 1 byte.
This information will be stored in the header. As the table header becomes memory resident during access or any operation, there will be no overhead to compress and de-compress the data. Here’s how table header and data will look like after applying MVC-
Like this we can apply MVC on column having repeated values and data type like BYTE, VARCHAR, INTEGER, DATE, DECIMAL, etc. We can apply MVC on 255 values per column and due to MVC, table header size will increase depending on number of values we are going to compress.