Redshift select * vs select single column

Question

I'm having the following Redshift performance issue:

I have a table with ~2 billion rows, ~100 varchar columns, and one int8 column (intCol). The table is relatively sparse, although some columns have a value in every row.

The following query:

select colA from tableA where intCol = '111111';

returns approximately 30 rows and runs relatively quickly (~2 mins).

However, the query:

select * from tableA where intCol = '111111';

takes an undetermined amount of time (gave up after 60 mins).

I know pruning the columns in the projection is usually better, but this application needs the full row.

Questions:
Is this just a fundamentally bad thing to do in Redshift? If not, why is this particular query taking so long? Is it related to the structure of the table somehow? Is there some Redshift knob to tweak to make it faster? I haven't yet messed with the distkey and sortkey on the table, but it's not clear that those should matter in this case.
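
For reference, the table's current distribution key, sort key, and column encodings can be inspected with something along these lines. This is only a minimal sketch: it assumes tableA lives in a schema on the current search_path (pg_table_def only lists those tables) and that the name is stored lowercased as 'tablea'.

```sql
-- List each column with its type, compression encoding, and key flags.
-- Assumption: the table name is stored in lowercase and its schema is on the search_path.
select "column", type, encoding, distkey, sortkey
from pg_table_def
where tablename = 'tablea';

-- Show the query plan for the slow query without actually running it.
explain
select * from tableA where intCol = '111111';
```

If intCol turns out to be neither the sort key nor part of it, both queries presumably have to scan every block of intCol, and the select * version additionally has to read the matching blocks from each of the ~100 varchar columns.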
