SAS Tip of the Month

Continuing on from last month, conversion of numeric to character, or character to numeric, remains an issue. Let's take a look at the following example:

277 data TestDates;
278 infile cards;
279 input SubjectID $ 1-6 TestDate $ 8-15
280 @17 FirstTreatmentDate date9.;
281 put (_all_) (=);
282 cards;
SubjectID=000101 TestDate=06152008 FirstTreatmentDate=17198
SubjectID=000102 TestDate=08082008 FirstTreatmentDate=17332
NOTE: The data set WORK.TESTDATES has 2 observations and 3 variables.
285 ;
286 run;
287 data TestDatesModified;
288 attrib TestDateN length=8 format=date9.;
289 set TestDates;
290 TestDateN = TestDate; ** Convert TestDate to Numeric;
291 DaysSinceFirstTreatment = TestDateN - FirstTreatmentDate;
292 put (_all_) (=);
293 run;
NOTE: Character values have been converted to numeric
values at the places given by: (Line):(Column).
290:16
TestDateN=21AUG**** SubjectID=000101 TestDate=06152008 FirstTreatmentDate=17198
DaysSinceFirstTreatment=6134810
TestDateN=********* SubjectID=000102 TestDate=08082008 FirstTreatmentDate=17332
DaysSinceFirstTreatment=8064676
NOTE: There were 2 observations read from the data set WORK.TESTDATES.
NOTE: The data set WORK.TESTDATESMODIFIED has 2 observations and 5 variables.

As the example shows, because we blindly assigned the character variable TestDate to a numeric variable, the values of TestDateN are wrong: SAS read the string 06152008 as the number 6152008 and treated it as a count of days since January 1, 1960, which is why the date9. format cannot display it and DaysSinceFirstTreatment is in the millions. Note the log message above (line 290, column 16), which points to the implicit conversion. To fix this, we need to use the INPUT function with an informat, as the following example shows:

294 data TestDates;
295 infile cards;
296 input SubjectID $ 1-6 TestDate $ 8-15
297 @17 FirstTreatmentDate date9.;
298 put (_all_) (=);
299 cards;
SubjectID=000101 TestDate=06152008 FirstTreatmentDate=17198
SubjectID=000102 TestDate=08082008 FirstTreatmentDate=17332
NOTE: The data set WORK.TESTDATES has 2 observations and 3 variables.
302 ;
303 run;
304 data TestDatesModified;
305 attrib TestDateN length=8 format=date9.;
306 set TestDates;
307 TestDateN = input(TestDate,mmddyy8.); ** Convert TestDate to Numeric;
308 DaysSinceFirstTreatment = TestDateN - FirstTreatmentDate;
309 put (_all_) (=);
310 run;
TestDateN=15JUN2008 SubjectID=000101 TestDate=06152008 FirstTreatmentDate=17198
DaysSinceFirstTreatment=500
TestDateN=08AUG2008 SubjectID=000102 TestDate=08082008 FirstTreatmentDate=17332
DaysSinceFirstTreatment=420
NOTE: There were 2 observations read from the data set WORK.TESTDATES.
NOTE: The data set WORK.TESTDATESMODIFIED has 2 observations and 5 variables.

To avoid problems with character-to-numeric conversion, always use the INPUT function with an appropriate informat, so that the correct numeric value is stored in the numeric variable.
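
The same INPUT function can also guard against character values that do not match the informat. The following is only a minimal sketch, not part of the original tip: the third data row (subject 000103, with month 13) and the data set name TestDatesChecked are made up for illustration. It relies on the ?? modifier, which suppresses the invalid-data messages and leaves _ERROR_ unset, so the step can report unreadable values itself.

```sas
/* Hypothetical input: the third row carries an invalid date string (month 13). */
data TestDates;
   infile cards;
   input SubjectID $ 1-6 TestDate $ 8-15;
   cards;
000101 06152008
000102 08082008
000103 13452008
;
run;

/* TestDatesChecked is an illustrative name, not from the original tip. */
data TestDatesChecked;
   attrib TestDateN length=8 format=date9.;
   set TestDates;

   /* Explicit conversion; ?? suppresses the invalid-data messages */
   /* and keeps _ERROR_ from being set when the value is unreadable. */
   TestDateN = input(TestDate, ?? mmddyy8.);

   /* Report values that were present but could not be converted. */
   if missing(TestDateN) and not missing(TestDate) then
      put "Unreadable TestDate value: " SubjectID= TestDate=;
run;
```

With this approach, a value the informat cannot read should come through as a missing TestDateN rather than a misleading number, and the PUT statement leaves a trace in the log for follow-up.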

Updated September 1, 2008