sql - Amazon Redshift doing Hash Join even when joined on column that is both Dist Key and Sort Key -
i have fact table in redshift having 1.3 billion rows distribution key c1 , sort key c1, c2.
i need join table join clause on c1 (i.e. c1 1st instance of table = c1 2nd instance of table).
as see query plan of query, redshift appears doing hash join ds_dist_none. though ds_dist_none expected have both dist key , sort key on column c1, expected redshift merge join instead of hash join (again because of same reason).
i believe slowing down query.
can please explain why redshift may doing hash join instead of merge join (even though have both dist key , sort key on joining column) , redshift doing ds_dist_none query?
it turned out since not inserting data in our table in sorted order (as defined sort key of table) , redshift not automatically keep table's rows sorted sort key, there no way redshift perform merge join on our table. after running full vacuum on table, redshift started performing merge join
Comments
Post a Comment