![]() |
|
SAS Tip
of the Month Occasionally we need to split a dataset into parts -- one reason may be that the dataset is to large to send as one file but can be sent in parts. One of the tools I have to do this is a macro that counts the number of records in a dataset, then splits that dataset into a user defined number of records: %macro breakDataset(dataset=_last_ /*Input dataset*/
,numindset=100 /*Maximum number of obs in output datasets*/
);
*Count the number of records and calculate the number of datasets needed;
data _null_;
if 0 then set &dataset nobs=nobs;
call symput('nobs',strip(put(nobs,8.)));
call symput('numoutdset',strip(put(ceil(nobs/&numindset),8.)));
stop;
run;
*Split dataset -- output dataset name is OriginalDatasetName_number;
%do i=1 %to &numoutdset;
data &dataset._&i;
set &dataset (firstobs=%eval((&numindset.*(&i-1))+1) obs=%eval(&numindset.*&i));
run;
%end;
%mend breakDataset;
The following example creates a dataset X with 2500 records, and when the macro is run three datasets are created (X_1, X_2 and X_3) with 1000 observations for the first two and 500 observations in the last dataset. data x;
do k=1 to 2500;
output;
end;
run;
%breakDataset(dataset=x
,numindset=1000
);
run;
If you are looking for another solution, you may be interested in a paper called Splitting a Large SASŪ Data Set written by Selvaratnam Sridharma from the Census Bureau in Washington, D.C., presented at SUGI 28 back in 2003. See you in August. |
|
| ________________________________ Updated July 5, 2010 |
|