Bharat Guda
Bharat Guda

Reputation: 117

Split the text file into 3 data sets/ Tables

I have a data like below in text file.

How do i separate the text file into 3 data sets/tables?

1 with earnings data, 2nd with redemptions data, 3rd with expirations data. each one of them have lot of rows, i just mentioned only 3-4 rows for each one of them. I am trying to use Infile statement but not knowing how to split. Here is the thought: first, initial data(earnings) will be read and whenever sas identifies word redemptions it has to stop and rest of the data has to go to the 2nd dataset and Whenever sas identifies the word Expirations, the data below that keyword has to go to the 3rd dataset. Any suggestions ?

Earnings
abc 123 xyz abjjdd
bhb edw ajd jnjnjknn
ebc ecc cec cecekckk
....
redemptions
abc 123 xyz abjjdd
bhb edw ajd jnjnjknn
ebc ecc cec cecekckk
Expirations
abc 123 xyz abjjdd
bhb edw ajd jnjnjknn
ebc ecc cec cecd ccsdc
 djc c djc cjdcjjnc

Upvotes: 1

Views: 86

Answers (1)

Dominic Comtois
Dominic Comtois

Reputation: 10411

Using a retain variable will help you achieve this.

Using the code below, just replace the datalines in the infile statement with the file name and set the right infile parameters.

data rawImport;
  infile datalines dsd delimiter=' ' truncover;
  informat C1-C4 $32.;
  input C1-C4;
  datalines;
Earnings
abc 123 xyz abjjdd
bhb edw ajd jnjnjknn
ebc ecc cec cecekckk
Redemptions
abc 234 xyz abjjdd
bhb edw ajd jnjnjknn
ebc ecc cec cecekckk
Expirations
abc 345 xyz abjjdd
bhb edw ajd jnjnjknn
ebc ecc cec cecd ccsdc
djc c djc cjdcjjnc
;

By using a retain variable, we can now dispatch the lines to the appropriate datasets.

data Earnings Redemptions Expirations;
  set rawImport;
  length outputDS $ 12;
  retain outputDS;

  * Determine output dataset;
  if C1 = "Earnings" then do;
    outputDS = "Earnings";
    delete;
  end;
  else if C1 = "Redemptions" then do;
    outputDS = "Redemptions";
    delete;
  end;
  else if C1 = "Expirations" then do;
    outputDS = "Expirations";
    delete;
  end;

  * output to appropriate dataset;
  if outputDS = "Earnings" then output Earnings;
  else if outputDS = "Redemptions" then output Redemptions;
  else if outputDS = "Expirations" then output Expirations;

  drop outputDS;
run;

Log now shows:

NOTE: There were 13 observations read from the data set WORK.RAWIMPORT.
NOTE: The data set WORK.EARNINGS has 3 observations and 4 variables.
NOTE: The data set WORK.REDEMPTIONS has 3 observations and 4 variables.
NOTE: The data set WORK.EXPIRATIONS has 4 observations and 4 variables.

Upvotes: 1

Related Questions