Friday, November 4, 2011

Importing Multiple Spreadsheets in SAS for Windows

What if you had to integrate data from a large number of spreadsheets scattered throughout a Windows file directory, including a number of subdirectories? Traversing a directory and collecting file paths might sound like a daunting proposal, but SAS can handle it. This post sketches out a possible solution in three steps.

Step one is to create a master list of all files in the directory, including all subdirectories:

filename DIRLIST pipe 'dir "C:\YOUR\FILE\LOCATION\" /s /b';
data dirlist;
     length buffer $256;
     infile DIRLIST length=reclen;
     input buffer $varying256. reclen;
run;

The above is a variation on SAS's solution to the problem, which is posted at http://support.sas.com/kb/24/820.html. The only difference in the above is that the MS DOS dir command contains /b, which allows the output table to populate with simple list of file paths. Read more about the dir command at http://www.computerhope.com/dirhlp.htm.

Step two is to create a subset of DIRLIST containing only the files to be accessed. If the desired files all have a common string in their file name, this PROC SQL statement will do the job:

PROC SQL;
CREATE TABLE myTable AS
SELECT * FROM DIRLIST WHERE buffer LIKE "%myString%";
QUIT;

If there is no common string in the file names, then your situation is more difficult. Nevertheless, chances are good that your data are all in one type of file, such as .xls, so you can create your subset of file paths based on whatever your extension may be:

PROC SQL;
CREATE TABLE myTable AS
SELECT * FROM DIRLIST WHERE buffer LIKE "%.xls%";
QUIT;

Step three is to loop through each one of the records in myTable, importing data from a single file on each pass. One way to create the outer loop is to adopt and adapt the SCANLOOP macro, which has been published at http://www2.sas.com/proceedings/sugi26/p093-26.pdf. A PROC IMPORT statement or DATA step (with INFILE statement) can be included within the SCANLOOP macro to pull in a file's data on each step - here is a possible example when each spreadsheet that you want to bring in contains two columns:

/* Filename statement to point to file to bring in */
/* LOC is macro variable containing file path */
filename MyImportFile "&LOC";
/* Import data into MyImportFile dataset */
DATA MyImportTEMP 
INFILE MyImportFile DLM=',|' MISSOVER DSD linesize=32767;
        /* Include needed number of variables */
LENGTH var1 $ 100 var2 $ 100;
INPUT var1 var2; 
RUN; 
/* Merge MyImportTEMP into MyImportALL dataset */
PROC DATASETS;
APPEND BASE=MyImportALL DATA=MyImportTEMP FORCE;
RUN; 

The above script will work when all of the files have some a common structure. If there is more than one type of file structure, one option would be to use one such PROC IMPORT or DATA step for each type, then use macro variables + IF/THEN logic to make sure that the correct one runs for each of your files.

Tuesday, August 30, 2011

SAS Macro to Import Excel Files into SAS

/***********************************************************
IMPORT EXCEL files into SAS
-works with both xls and xlsx files
-assumes all files to be imported are in the same folder

-will pull in the first sheet only; for others, use RANGE option in PROC IMPORT statement
August 30, 2011
http://sastipsbyhal.blogspot.com/
*/
%LET file1 = myFirstFile.XLSorXLSX;
%LET file2 = mySecondFile.
XLSorXLSX;

%LET file3 = myThirdFile.XLSorXLSX;
%LET file4 = myFourthFile.
XLSorXLSX;
%LET file5 = myFifthFile.
XLSorXLSX;
%LET file6 = mySixthFile.
XLSorXLSX;
%LET file7 = mySeventhFile.
XLSorXLSX;
%LET file8 = myEighthFile.
XLSorXLSX;
%LET fileName1 = sasNameFor_myFirstFile;
%LET fileName2 = sasNameFor_mySecondFile;
%LET fileName3 = sasNameFor_myThirdFile;
%LET fileName4 = sasNameFor_myFourthFile;
%LET fileName5 = sasNameFor_myFifthFile;
%LET fileName6 = sasNameFor_MySixthFile;
%LET fileName7 = sasNameFor_mySeventhFile;
%LET fileName8 = sasNameFor_myEighthFile;
%LET fileLoc = \\folder\where\txt\files\are\located\;
%LET fileMax = 8;
%MACRO excelIMPORT;
%Do i = 1 %to &fileMax. ;
PROC IMPORT OUT=WORK.&&fileName&i
            DATAFILE= "&fileLoc&&file&i"
            DBMS=EXCEL REPLACE;
     GETNAMES=YES;
     MIXED=NO;

       SCANTEXT=YES;
       USEDATE=YES;
       SCANTIME=YES;
RUN;
%END;
%MEND;
%excelIMPORT;



Thursday, July 21, 2011

Wednesday, July 6, 2011

How to create a SAS date from "YYYY-MM-DD HH:MM:SS"

Problem
How to pull a SAS date from a text field formatted as YYYY-MM-DD HH:MM:SS.

Solution
data myTable;
set myTable;
informat dateVar date9.;
format dateVar date9.;
dateVar = input(substr(temp_dt,1,10), yymmdd10.);
run;

Friday, June 10, 2011

SAS Macro to Import Tab-Delimited Files into SAS


/***********************************************************
IMPORT TAB-DELIMITED text files into SAS
June 10, 2011
http://sastipsbyhal.blogspot.com/
*/
%LET file1 = myFirstFile.txt;
%LET file2 = mySecondFile.txt;
%LET file3 = myThirdFile.txt;
%LET file4 = myFourthFile.txt;
%LET file5 = myFifthFile.txt;
%LET file6 = mySixthFile.txt;
%LET file7 = mySeventhFile.txt;
%LET file8 = myEighthFile.txt;
%LET fileName1 = sasNameFor_myFirstFile;
%LET fileName2 = sasNameFor_mySecondFile;
%LET fileName3 = sasNameFor_myThirdFile;
%LET fileName4 = sasNameFor_myFourthFile;
%LET fileName5 = sasNameFor_myFifthFile;
%LET fileName6 = sasNameFor_MySixthFile;
%LET fileName7 = sasNameFor_mySeventhFile;
%LET fileName8 = sasNameFor_myEighthFile;
%LET fileLoc = \\folder\where\txt\files\are\located\;
%LET fileMax = 8;
%MACRO tabdIMPORT;
%Do i = 1 %to &fileMax. ;
PROC IMPORT OUT=WORK.&&fileName&i 
            DATAFILE= "&fileLoc&&file&i"
            DBMS=TAB REPLACE;
    GETNAMES=YES;
    DATAROW=2; 
RUN;
%END;
%MEND;
%tabdIMPORT;