AWS Redshift Table Redesign
AWS Redshift Table Redesign

To redesign an AWS Redshift table optimally, you’d typically consider factors like data access patterns, scalability, performance, and cost-effectiveness. Here are some steps you can follow:
- Run Vacuum:
The VACUUM command reclaims space and re-sorts rows in your table. This is important for optimizing performance by removing deleted rows, reorganizing data for better storage efficiency, and updating statistics used by the query optimizer.
- Analyze Compression:
The ANALYZE COMPRESSION command analyzes the compression ratios for each column in your table. It provides insights into how effectively each column is compressed and identifies opportunities for further compression optimizations.
- Review Analysis Results:
When reviewing the analysis results, focus on columns where the storage savings from compression are significant or where the current compression settings may not be optimal. Look for patterns such as high redundancy or low variation in column values that indicate potential for better compression.
- Alter Columns:
Based on the analysis, use ALTER TABLE queries to make changes to columns. This may involve altering data types and sizes to reduce storage requirements, as well as adjusting compression encoding to improve compression ratios. For example, using zstd compression for VARCHAR columns with limited variability can often yield good results.
Alter Datatype and size example: ALTER TABLE schema_name.table_name ALTER COLUMN column_name TYPE VARCHAR(256);
Change compression encoding: ALTER TABLE schema_name.table_name ALTER COLUMN column_name ENCODE zstd;
- Specify SORTKEY, DISTKEY, DISTSTYLE (if needed):
If your analysis suggests that optimizing data distribution and sorting can improve query performance, use ALTER TABLE queries to specify sort keys, distribution keys, and distribution styles. This helps Redshift’s query planner make more efficient use of resources during query execution.
ALTER TABLE schema_name.table_name ADD DISTKEY (dist_column_name);
ALTER TABLE schema_name.table_name DISTSTYLE ALL;
ALTER TABLE schema_name.table_name ADD SORTKEY (sort_column_name);
- Repeat Vacuum and Analyze (optional):
After making alterations, it can be beneficial to repeat the VACUUM and ANALYZE COMPRESSION steps. This ensures that the table remains optimized and that any new data added reflects the optimized column configurations and compression settings. - The following table identifies the supported compression encodings and the data types that support the encoding.
Step 2: Run the ANALYZE COMPRESSION command.
Step 3: Observe the results and make changes accordingly.
Step 4: Repeat steps 1 and 2 as necessary.
Reference:
Compression Encoding: https://docs.amazonaws.cn/en_us/redshift/latest/dg/c_Compression_encodings.html
Alter table: https://docs.aws.amazon.com/redshift/latest/dg/r_ALTER_TABLE.html
Distribution Style: https://docs.aws.amazon.com/redshift/latest/dg/c_best-practices-best-dist-key.html
Sort-Key: https://docs.aws.amazon.com/redshift/latest/dg/c_best-practices-sort-key.html
Different Sort-Key: https://docs.aws.amazon.com/redshift/latest/dg/t_Sorting_data.html
Data Types: https://docs.aws.amazon.com/redshift/latest/dg/c_Supported_data_types.html
Vacuum: https://docs.aws.amazon.com/redshift/latest/dg/r_VACUUM_command.html
Analyze Compression: https://docs.aws.amazon.com/redshift/latest/dg/r_ANALYZE_COMPRESSION.html
Thank you so much for reading...
Comments
Post a Comment