Data Dictionaries and Frequency Distributions from Multiple .sav Files in a Folder — to Excel
This is Python code embedded in SPSS syntax inside a BEGIN PROGRAM ... END PROGRAM block.
What does it do? It finds the folder containing the current syntax file and opens every .sav file in that folder in turn. For each opened file it runs the DISPLAY DICTIONARY command and writes the output to the Dict sheet of an Excel file located in the same folder.
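The folder scan described above can be tried outside SPSS. The following is a minimal sketch, not the author's code: the helper name `list_sav_files` is hypothetical, and unlike the listing on this page it matches the extension case-insensitively and skips directories.

```python
import os

def list_sav_files(folder, ext=".sav"):
    """Return the sorted names of regular files in `folder` whose name ends with `ext`.

    Assumption: matching is case-insensitive (so DATA.SAV is included),
    which differs from a plain str.endswith check on the raw name.
    """
    return sorted(
        name
        for name in os.listdir(folder)
        if name.lower().endswith(ext.lower())
        and os.path.isfile(os.path.join(folder, name))
    )
```

Calling `list_sav_files(".")` returns the data files of the current folder in the order in which they would be processed.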
It then builds a Custom Tables frequency table for every variable in the file and writes that output to the same Excel file (to the Ctab sheet). Finally it closes the .sav file and moves on to the next one.
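To see what the generated Custom Tables syntax looks like without running SPSS, the string building can be sketched as a pure function. This is an illustration under assumptions: `build_ctables` and `STATS` are hypothetical names, and the variable names are passed in as a list rather than read from the active dataset.

```python
# Summary statistics requested for each categorical variable
STATS = "[C][COUNT F40.0, UCOUNT F40.0, COLPCT.COUNT PCT40.1, COLPCT.VALIDN PCT40.1]"

def build_ctables(var_names):
    """Return CTABLES syntax that stacks one frequency table per variable."""
    indent = " " * 12
    # Variables are joined with "+" so the tables are stacked; no "+" after the last one
    table_expr = "+\n".join(indent + name + STATS for name in var_names)
    categories = "\n".join(indent + name for name in var_names)
    return "\n".join([
        "CTABLES",
        "/TABLE",
        table_expr,
        "/CATEGORIES VARIABLES=",
        categories,
        "[OTHERNM, MISSING] EMPTY=INCLUDE TOTAL=YES POSITION=AFTER.",
    ])

if __name__ == "__main__":
    # Example with two made-up variable names
    print(build_ctables(["age", "sex"]))
```

Printing the result before submitting it is a convenient way to check the syntax for a file with many variables.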
NB! The Excel file (output.xlsx by default) is deleted first if it already exists.
NB! The Custom Tables feature must be installed to use this syntax as is.
Author - Anton Mityushin, mitanton@yandex.ru.
* Encoding: UTF-8.

SET OVARS=BOTH ONUMBERS=BOTH TVARS=BOTH TNUMBERS=BOTH.
SET PRINTBACK ON.
SET UNICODE ON.
OUTPUT CLOSE ALL.

BEGIN PROGRAM Python3.
import os
import spss, SpssClient

mask = ".sav"
output_filename = "output"  # Specify the desired Excel output file name

def Ctab():
    """
    This function constructs the Custom Tables procedure code to be executed.
    """
    nl = "\n"
    tb = " " * 4
    varsForProc = ""
    varsForTables = ""
    tblCat = "[C][COUNT F40.0, UCOUNT F40.0, COLPCT.COUNT PCT40.1, COLPCT.VALIDN PCT40.1]+"
    for i in range(spss.GetVariableCount()):
        varsForProc += (tb*3) + spss.GetVariableName(i) + nl
        varsForTables += (tb*3) + spss.GetVariableName(i) + tblCat + nl
    varsForTables = varsForTables.replace(nl+tblCat+nl, nl+"")
    # Drop the trailing "+" and newline left after the last variable
    varsForTables = varsForTables[:len(varsForTables)-2]
    ct = "CTABLES" + nl + \
         "/TABLE" + nl + \
         varsForTables + nl + \
         "/CATEGORIES VARIABLES=" + nl + \
         varsForProc + nl + \
         "[OTHERNM, MISSING] EMPTY=INCLUDE TOTAL=YES POSITION=AFTER."
    return ct

def treatFolder(path, ext, func):
    """
    This function iterates over the .sav files in a folder and calls syntax execution on every file.
    Note that the SpssCmd function is passed as a parameter to this function.
    """
    files = sorted(os.listdir(path))
    for file in files:
        if file.endswith(ext):
            func(file)

def SpssCmd(file):
    """
    This function drives the syntax to be executed on every file.
    Note that it submits the output of the Ctab function as the CTABLES syntax.
    """
    spss.Submit("GET FILE = '" + file + "'.")
    spss.Submit("OUTPUT NEW NAME=Dict.")
    spss.Submit("DISPLAY DICTIONARY.")
    spss.Submit("OUTPUT EXPORT /XLSX DOCUMENTFILE='%s.xlsx' OPERATION=MODIFYSHEET SHEET='Dict'." % output_filename)
    spss.Submit("OUTPUT CLOSE *.")
    spss.Submit("OUTPUT NEW NAME=Ctab.")
    spss.Submit(Ctab())
    spss.Submit("OUTPUT EXPORT /XLSX DOCUMENTFILE='%s.xlsx' OPERATION=MODIFYSHEET SHEET='Ctab' LOCATION=LASTROW." % output_filename)
    spss.Submit("OUTPUT CLOSE *.")
    spss.Submit("DATASET CLOSE *.")

# Above we just defined the necessary functions. The action starts here.
SpssClient.StartClient()
sDoc = SpssClient.GetDesignatedSyntaxDoc()
filePath = sDoc.GetDocumentPath()  # Path of the current syntax file
path = os.path.dirname(filePath)   # Here we get the folder where the current syntax file is located
spss.Submit("CD '" + path + "'.")  # ... and make it our working directory
# Here the Excel file is deleted if it exists (done in Python, so it does not depend on the DEL shell command)
xlsx = os.path.join(path, output_filename + ".xlsx")
if os.path.exists(xlsx):
    os.remove(xlsx)
treatFolder(path, mask, SpssCmd)
SpssClient.StopClient()
END PROGRAM.