Please see my other blog for Oracle EBusiness Suite Posts - EBMentors


Note: All the posts are based on a practical approach, avoiding lengthy theory. All have been tested on development servers. Please don't try any post on production servers until you are sure.

Monday, December 09, 2013

Exadata: Hybrid Columnar Compression & Considerations

Brief:
Exadata Hybrid Columnar Compression (EHCC) is a type of segment compression. With HCC, data is organized for storage by column for a set of rows; it is this combination of columnar organization and row grouping, compressed together, that leads to the "hybrid" tag.



When data is inserted into HCC segments, Oracle examines the input stream from the direct path insert/load operation and divides it into arrays of columns. It then applies a compression algorithm, determined by the HCC type (QUERY LOW, QUERY HIGH, ARCHIVE LOW, ARCHIVE HIGH), to these columnar values and formats a collection of blocks into a logical construct called a compression unit (CU) for the range of rows being inserted. Each CU contains multiple blocks, each of which holds columnar subsets of the entire range of columns in the segment. When the first CU is filled with its rows (again, organized by column), a second CU is established, then a third, and so on until all of the rows are inserted.
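Because HCC compression only happens during direct path operations, the compression type is normally declared when the segment is created or moved. A minimal sketch, assuming a hypothetical source table named SALES (HCC itself requires Exadata storage):

```sql
-- CTAS is a direct path load, so the rows land already HCC-compressed.
CREATE TABLE sales_hcc
  COMPRESS FOR QUERY HIGH
  AS SELECT * FROM sales;

-- Recompress an existing segment with a different HCC type.
ALTER TABLE sales_hcc MOVE COMPRESS FOR ARCHIVE LOW;

-- Subsequent bulk loads must also use direct path to stay compressed.
INSERT /*+ APPEND */ INTO sales_hcc SELECT * FROM sales_staging;
COMMIT;
```

An `ALTER TABLE MOVE` rewrites the whole segment, so remember that indexes on the table become UNUSABLE and must be rebuilt afterward.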


Considerations:
Below are considerations to keep in mind when using HCC segments.
  • Use the compression advisor to estimate the disk space requirements and compression ratio for an existing, uncompressed table
  • Make sure you have sufficient disk space to run the advisor, as it temporarily creates (and subsequently drops) an actual physical segment in your database based on the compression type supplied to DBMS_COMPRESSION.GET_COMPRESSION_RATIO.
  • Execute the advisor during off-peak hours
  • Data must be loaded into HCC segments using direct path load operations (CREATE TABLE AS SELECT, direct path INSERT with the APPEND hint, ALTER TABLE MOVE) for the rows to be compressed according to your desired compression type.
  • Rows inserted via conventional inserts into an HCC table or partition are stored uncompressed in normal data blocks (in other words, not in an HCC compression unit).
  • Performing DML on HCC tables is discouraged for performance reasons: when a row is updated, it is migrated to another block and re-inserted, initially uncompressed.
  • Updating a single row in an HCC table locks the entire CU containing the row, so updates to other rows in those blocks must wait for the pending transaction to commit or roll back.
  • If many updates occur, over time the percentage of compressed blocks decreases, and so does the segment’s compression ratio.
  • With HCC, compression operations always occur on the compute (DB) nodes as data is inserted via direct path load/insert.
  • Decompression occurs on the storage cells when HCC data is accessed via Smart Scan and the amount of data is greater than 1 MB; this is indicated by the "cell CUs sent uncompressed" and "cell IO uncompressed bytes" statistics.
  • Any I/O requests smaller than 1 MB in size cause the storage cells to ship compressed CUs to the compute node.
  • When accessed without Smart Scan (for example, index access to HCC segments), the storage cells do not perform any decompression; they send compressed blocks back to the compute node’s database buffer cache, where they are decompressed on the compute node.
  • When using RMAN to restore to a non-Exadata platform, the data in HCC-formatted blocks will not be accessible until it has been moved into a non-HCC format (possible from version 11.2.0.2 onward). This means a lengthy delay before the data can be accessed.
  • Also think about space requirements when moving data into a non-HCC format. If HCC is providing a 10× compression factor, you will need 10 times the space you are currently using available on the target environment to hold the decompressed data.
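The compression advisor mentioned above can be driven from an anonymous PL/SQL block. A hedged sketch with hypothetical owner, table, and scratch tablespace names (SCOTT, SALES, USERS); the comptype constants such as DBMS_COMPRESSION.COMP_FOR_QUERY_HIGH are the documented package constants:

```sql
-- Run during off-peak hours: the advisor builds and drops a real scratch segment.
SET SERVEROUTPUT ON
DECLARE
  l_blkcnt_cmp    PLS_INTEGER;
  l_blkcnt_uncmp  PLS_INTEGER;
  l_row_cmp       PLS_INTEGER;
  l_row_uncmp     PLS_INTEGER;
  l_cmp_ratio     NUMBER;
  l_comptype_str  VARCHAR2(100);
BEGIN
  DBMS_COMPRESSION.GET_COMPRESSION_RATIO(
    scratchtbsname => 'USERS',   -- tablespace for the temporary scratch segment
    ownname        => 'SCOTT',
    tabname        => 'SALES',
    partname       => NULL,
    comptype       => DBMS_COMPRESSION.COMP_FOR_QUERY_HIGH,
    blkcnt_cmp     => l_blkcnt_cmp,
    blkcnt_uncmp   => l_blkcnt_uncmp,
    row_cmp        => l_row_cmp,
    row_uncmp      => l_row_uncmp,
    cmp_ratio      => l_cmp_ratio,
    comptype_str   => l_comptype_str);

  DBMS_OUTPUT.PUT_LINE('Estimated ' || l_comptype_str ||
                       ' compression ratio: ' || ROUND(l_cmp_ratio, 2));
END;
/
```

Repeat the call with the other COMP_FOR_* constants to compare the four HCC types before choosing one.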
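To observe whether decompression is happening on the storage cells, the two statistics named above can be checked directly. A simple sketch against the instance-wide view (a session-level check would join v$mystat to v$statname instead):

```sql
-- Cumulative counts since instance startup; sample before and after a
-- Smart Scan query to see the delta.
SELECT name, value
FROM   v$sysstat
WHERE  name IN ('cell CUs sent uncompressed',
                'cell IO uncompressed bytes');
```

If these values do not grow while scanning an HCC table, the CUs are being shipped compressed and decompressed on the compute nodes instead.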
Mindmap
Below is a mindmap created while studying this topic.

