Обработать множество файлов в одной папке
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 | * Encoding: UTF-8. * Goal is to process all Excel files in a folder: read them one by one, and join data to a single SPSS .sav file. * Files are poor organized: data started in range different from A1:, have different number of rows in different files. * Some columns are empty. I know that my quantitative data is contained in first and third columns. * But as there are some text comments in the first rows, I prefer to read data first into string, and then convert to numbers. * Anton Balabanov, 20.09.2015. SET MPRINT=YES. DATASET CLOSE ALL. OUTPUT CLOSE ALL. * First, create 1-row dataset which will acquire data from every file in turn. DATA LIST LIST /V1(A500) V3(A500). BEGIN DATA. "Just to have large enough variable which will accept data from Excel files" "Another variable" END DATA. DATASET NAME MergedData. BEGIN PROGRAM. |
25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 | import os from spss import Submit folder = os.path.join('D:\\', '$cons', u'Тест', 'MS_data') Submit (r"CD '%s'." % folder) for file in os.listdir(folder): filename, fileext = os.path.splitext(file) if fileext == '.xls': Submit ( r""" GET DATA /TYPE=XLS /FILE="%s" /SHEET=index 1 /READNAMES=OFF. STRING source (A50). * This is to keep track from which file data came. COMPUTE source = "%s". EXECUTE. * This is to make merging possible, as I don't know in advance how wide string will be in a file. ALTER TYPE V1 V3 (A500). * These manipulations is to organize merging data into single dataset in a loop. DATASET NAME NewData. DATASET ACTIVATE MergedData. ADD FILES FILE=* FILE=NewData. EXECUTE. DATASET CLOSE NewData. """ % (file, filename)) |
52 53 54 55 56 57 58 59 60 61 | END PROGRAM. * Some postprocessing: convert to number and get rid from unnecessary variables. COMPUTE mz = Number(V1, F10). COMPUTE intens = Number(V3, F10). SELECT IF ~Mis(mz). EXECUTE. ADD FILES FILE=* /KEEP mz intens source. * Done! Save it. SAVE OUTFILE="MassData.sav". |