Hi,
In one of my packages, I have join data from 2 data sets. The problem is 'Join' requires both dataset to be sorted and I can't sort one dataset without 'sort' transformation. The real problem is since I have to sort 65million records, the sort operation sucks always. The other option is using 'Lookup', but since I have to do 'Range Lookup', it is taking hours and hours to process less # of records.
My question is, is there any other way, I can do join of 2 datasets without sorting

How to join 2 datasets without sorting
pjmvn
Correct. If the data is not sorted when it enters the pipeline then you will need to SORT it prior to the MERGE JOIN.
-Jamie
elemr
HTH, Jens Suessmeyer.
Malmer
Jamie - I don't think it is completely true. Merge join definitely needs ordered data sets to do the join but that is not the case with the loop or Hash match.
So I think having to sort the data for SSIS (Be it in DB or using Sort Transformation) is an extra overhead that one could live without. Infact in cases where the dataset is reasonably large this (sort) is a bottleneck.
Bob_Sullivan
Any join operation requires a sort. This is not specific to SSIS. If you do a join in a relational database then under the covers sorting will occur. The difference in SSIS is that you explicitly do the SORT yourself. This is a GOOD thing because if the data is already sorted then you have no need to do the SORT.
-Jamie
chaza
the fact that something can be done in ssis doesn't necessarily mean that it ought to be done. sorting millions of records in ssis is probably a bad idea unless the machine executing the package has many processors and a ton of memory. an alternative to using ssis to sort records is to load the records into a sql table and let the database engine sort the records.
Kamen
I'm not sure about avoiding sorting, but if you Must sort Large data and your performance is bad, take a look at this:
"High Performance Sorting for SQL Server Integration Services"
http://www.ordinal.com/ssis.html
Phantisy
Just curious as to why we need to sort it when I don't really need it to be
I don't necessarily care for Merge Join I could be happy with Nested or Hash.
ianpender
I have the same question, so does that mean that there is no way to join without sorting
dlevers
Hi,
Thanks for your reply. My problem is I am using a script component for one of the dataset and its 100% sure that the output can not be in sort order. So, if I set the 'IsSorted' property of the script component output to 'true', then I am getting very few records which are the output of the first buffer only.