Are you aware of the book SPSS Programming and Data Management?

Don't settle for the Graphical User Interface (GUI)!

The GUI is fine (I use it every day); however, using syntax in addition to the GUI can easily increase productivity by a factor of 5 to 10 for simple jobs, and by a factor of 50 or more for larger, complex jobs. Furthermore, some of SPSS's features are only available through syntax. As a "bonus", syntax files work on all versions of SPSS, not just on Windows.

There is something for everybody in the sample syntax files included here:

Suggestions and code contributions are welcome. Share what you know! Learn what you don't!


The syntax files are broadly classified by purpose as follows:

Caveats and Suggestions

I have not necessarily checked each and every file found here. I have grabbed some files that looked interesting but have not yet had the time to review them. So much code… so little time…

When the information is readily available, I show the names of the authors of syntax I did not write. Usually, I do not show email addresses in order to reduce the number of emails to authors. If anybody objects to having his or her code or name listed, please send me an email and I will quickly remove the reference. If, on the other hand, you send me code which would be useful to other visitors, I will gladly include it here with due credit. Such code should contain dummy data (ideally using DATA LIST or INPUT PROGRAM) and a description of its purpose.

If you don't measure it… you can't improve it!


Area under the curve (AUC)

  1. Area under the curve using Trapezoidal Integration.sps
  2. Incremental Area under the curve.sps

Batch files

  1. Example of bat file running an sps file
  2. Run syntax from batch file or command line.sps

Block Designs (with thanks to Valentim R. Alferes)

  1. Completely Randomized Designs.sps (equal or unequal n per treatment)
  2. Random assignment of units to experimental treatments.sps (for Randomized Block Designs, Simple & Generalized, and Completely Randomized Designs with equal n per treatment)

Bootstrap and random numbers

Tip: any time you use random numbers and need to be able to reproduce your results, put "SET SEED=number." at the beginning of your syntax, where 'number' is any 'random number' you come up with. One option is to use the current date & time (e.g. "SET SEED=1120722." if it is 7:22 on Nov 20th).
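As a minimal sketch of the tip above, the following sets the seed and then generates a few random values. The variable names id and x and the choice of 10 cases are illustrative assumptions, not taken from any file listed here. Running the block twice should list the same values both times. This assumes the random number generator that SET SEED controls; if the Mersenne Twister is active (SET RNG=MT), the starting point is set with MTINDEX instead.

```spss
* Fix the seed so that the random draws below can be reproduced.
SET SEED=1120722.

* Generate 10 dummy cases, each with one standard normal value.
INPUT PROGRAM.
LOOP id=1 TO 10.
COMPUTE x=RV.NORMAL(0,1).
END CASE.
END LOOP.
END FILE.
END INPUT PROGRAM.
LIST.
```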

  1. Bootstrap confidence interval for the variance of a variable.sps
  2. Bootstrap confidence interval for Cronbach alpha.sps
  3. Bootstrap crosstab.sps
  4. Bootstrap ordinary least square (OSL) estimators.sps
  5. Bootstrap the mean and median.sps
  6. Generating multinominal random variables.sps (AnswerNet)
  7. Generating multivariate hypergeometric random variables.sps (AnswerNet)
  8. Generating multivariate normal variables with a specific covariance matrix (AnswerNet)
  9. Generate random triad numbers.sps
  10. Get random sample of various size then calculate statistics.sps (Compare means of n samples of size s1 s2 ... sn ...)
  11. Get various random samples of same size calculate statistics.sps (Compare means of n1 n2 ... nn samples of size s)
  12. Sampling distribution of the correlation between 2 variables.sps 

    Bootstrapping using OMS (this requires v12)
  13. oms_bootstrapping.sps

Charts and Tables


  1. Bar charts for school types by sex where percentages of each sex add up to 100 percent.sps
  2. Blank bar for unselected category.sps (from AnswerNet) 
  3. Blank bar for unselected category (generalized).sps (note, however, that showing empty categories in CTABLES is trivial: all that is required is to specify EMPTY=INCLUDE in the /CATEGORIES subcommand)
  4. Compare (superimpose) two histograms.sps (a population pyramid could also be used; see the IGRAPH section below for an example)
  5. Count outliers.sps (show number of outliers in a boxplot)
  6. Do bar charts excluding categories with small number of cases.sps
    This macro is fully commented here.
    Newbies who do not know how to use a macro should read this explanation.
  7. Do many histograms with the same axis boundaries.sps (this demonstrates how to use the macro Dograph.sps and the template Dograph.sct to produce many graphs with the same x and y scales)
  8. Graph cumulative percentage retired at attained age by categorical variable.sps
  9. Graph cumulative percent on X axix.sps
  10. Graph survey question.sps
  11. Histogram with percent on y axis instead of numbers.sps
  12. Identify your own data in the chart.sps
  13. Identify your own data in the chart version2.sps This is a generalization of the above syntax. It uses the following chart template Identify Your Own Data.sct (right-click on this template and select save target)
  14. Print current date in chart title.sps
  15. Print current date and time in chart title.sps (Same technique can be used with Tables)
  16. Print histogram or bar chart depending on data.sps (a good macro example)
  17. Print school names as part of graph titles.sps
  18. Show mean values in line graph.sps
  19. Show 2 categories on same histogram.sps
  20. ZIPF law and graph.sps


  1. Construct a table "manually" in the data editor.sps (a good example of data restructuring)
  2. Construct a table "manually" example no 2.sps
    (non trivial code...)
  3. Find population frequency when multiple response with long strings.sps
  4. List variables in frequency table by order of medians.sps
  5. Print actual name group and id in heading of each listing.sps
  6. Print mean plus minus standard deviation in Table.sps
  7. Put 4 variables in the same frequency table.sps
  8. Show empty category in tables.sps (from AnswerNet; note that this is trivial with CTABLES)
  9. Show empty categories in tables (second method).sps
  10. Show mean values.sps
  11. Show number of valid cases in table footnote.sps
  12. Table where list of variables is generated by macro.sps (Illustrates the !IF ...!ELSE ... !IFEND macro command)


  1. Get statistics for grouping of variables.sps
  2. Sort categories by decreasing count but with Others as last one.sps
  3. Using Macros and CTABLE.sps
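Several notes above mention that showing empty categories is trivial in CTABLES. Here is a minimal sketch, assuming the Custom Tables option is installed. The variable region and its labels are made-up dummy data: value 3 ('East') has a value label but no cases, so it is the empty category.

```spss
* Dummy data: no case has region=3.
DATA LIST FREE / region.
BEGIN DATA
1 1 2 2 2 4
END DATA.
VARIABLE LEVEL region (NOMINAL).
VALUE LABELS region 1 'North' 2 'South' 3 'East' 4 'West'.

* EMPTY=INCLUDE keeps the labelled but unused category in the table.
CTABLES
  /TABLE region [COUNT]
  /CATEGORIES VARIABLES=region EMPTY=INCLUDE.
```

Changing EMPTY=INCLUDE to EMPTY=EXCLUDE should drop the 'East' row, which is a quick way to see what the keyword does.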

Cluster Analysis

  1. Cluster analysis using similarity proximity (count) data as input.sps
  2. Save centers of Hierarchical cluster analysis as initial value of K-means.sps

Combinations, Permutations, Interactions

  1. All combinations of 3 numbers out of n.sps (see "Find all Combinations .." below for a generalization)
  2. All combinations of 3 letters out of n.sps (with replacement)
  3. Calculate interaction terms between 2 categorical variables.sps (within a regression context)
  4. Create a new variable for each combination of 2 variables.sps
  5. Find all combinations of 1 up to n items out of m items.sps (high power stuff!)
  6. Find all combinations of n items out of m items.sps (high power stuff!)
  7. Find all permutations of integers 1 to n.sps Maximum value of n is 7. Combined with recode, this can find permutations of any strings or numbers.
  8. Generate orders for block of trials.sps
  9. Get all possible crossproducts of pairs of variables.sps (contains a fair amount of comments)


  1. Automatically compute sample weights to approximate population.sps
  2. Box-CoxTransformation.sps (transforms var1 using each of the 31 values of lambda between -2 and 1, in increments of 0.1)
  3. Count number of distinct values across 400 variables.sps
  4. Compute percentage of patients having each fracture category.sps 
  5. Compute z = x / max( y) where max( y) is over all cases.sps (it is sometimes preferable to use this macro technique)
  6. Compute distances between 2 points on earth.sps (with thanks to Simon Freidin)
  7. Compute average of m variables where m is a variable in the data file.sps
  8. Create a new variable equal to mean of an other variable.sps
  9. Find the cubic root.sps  
  10. Reverse the digits on an integer.sps
  11. Weight data based on 2 or more vars.sps With thanks to João Duarte!
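The Box-Cox entry above applies 31 values of lambda, from -2 to 1 in steps of 0.1. The usual definition of the transform is (x**lambda - 1)/lambda when lambda is not 0 and LN(x) when lambda is 0, and it requires positive data. The sketch below is one way to code this; it is not necessarily how the listed file does it. The data values and the names bc1 to bc31 are assumptions made for illustration.

```spss
* Dummy data (Box-Cox requires strictly positive values).
DATA LIST FREE / var1.
BEGIN DATA
1.2 3.5 0.8 7.1 2.4
END DATA.

* bc1 to bc31 hold the transforms for lambda = -2.0, -1.9, ..., 1.0.
* Lambda is built from the integer #k (in tenths) so the test for 0 is exact.
VECTOR bc(31).
LOOP #i=1 TO 31.
COMPUTE #k=#i-21.
DO IF (#k=0).
COMPUTE bc(#i)=LN(var1).
ELSE.
COMPUTE bc(#i)=(var1**(#k/10)-1)/(#k/10).
END IF.
END LOOP.
LIST.
```

Working with the integer #k rather than accumulating 0.1 avoids a floating-point lambda that is almost, but not exactly, zero.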

Concatenate/modify string variables (see also Parse or flag data)

  1. Apparent problem with concat.sps Newbies should take a look at this example.
  2. Combine a string variable and a numeric variable.sps
  3. Concatenate.sps (new string equals concatenation of values in second variable)
  4. Concatenate content of cases with same id.sps  
  5. Concatenate numbers.sps
  6. Concatenate 22 variables.sps
  7. Convert first letter of each word to upper case.sps (Thanks to A. Paul Beaulne for sending me this code)
  8. Create an id using name and dob.sps
  9. Normalise alpha.sps (Capitalise the first letter of each word, use lower case for the other letters)
  10. Normalize string.sps (delete spaces at beginning, remove period at end, capitalize all letters)
  11. Remove initial from name.sps
  12. Remove period from string.sps (can be modified to remove any other characters)
  13. Reorganize names.sps (place family name at the beginning of the string)
  14. Transform ascii codes into characters.sps

Data Editor

  1. Reduce size of columns in data editor.sps  
  2. Right align strings in data editor.sps

Data validation

  1. Perform tests on ssn.sps
  2. Validate likert and continuous values.sps

Dates and time (see also the Dates, Time and Age Tutorial)

  1. Add 60 days to a date then find end of that month.sps 
  2. Add leading zeros to a string date.sps
  3. Ages are in nnH nnD nnM and nnA.sps 
  4. Break down number of days in hospital by calendar month.sps
  5. Calculate age.sps
  6. Calculate time differences to milliseconds.sps 
  7. Calculate mean date and standard deviation in days.sps
  8. Calculate nb of days within the eligibility period.sps
  9. Caculate number of minutes between 2 timestamps (crossover midnight).sps  
  10. Calculate number of months between 2 dates.sps
  11. Calculate waiting time when time is coded in hh min.sps
  12. Compute number of weekdays between 2 dates.sps
  13. Compute number of weekdays excluding public holidays.sps
  14. Compute sleep time.sps
  15. Convert basis.sps
  16. Convert strings into numbers.sps (variable contains age in either of the following format "7 Y" for 7 years, "3m" for 3 months, "28D" for 28 days. Need to convert these to years.)
  17. Convert string formated as hhmmss into numeric time variable.sps (thanks to Jim Marks)
  18. Convert string 01jan1992 to a date variable.sps
  19. Convert string 1997-08-22 into a date variable.sps
  20. Convert string into date and time variables.sps  
  21. Convert string "2006-04-28 18:20:01" to datetime format.sps
  22. Convert string to date and select cases which fall during the weekend.sps
  23. Convert string "04Apri03" to a date variable.sps
  24. Date plus 3 months.sps
  25. Dates appear as asterix on chart.sps (solution)
  26. Extract time portion from string variable containing date and time.sps
  27. From AM PM to military time.sps
  28. Importing from excel (convert days into dates).sps
  29. Keep time portion of date when creating Tab delimited file.sps
  30. Make variable equal to current date.sps
  31. Number of consecutive 30 minutes of hypoxia.sps
  32. Print current date as part of graph title.sps
  33. Print date and time before a procedure.sps
  34. Print day name along with date.sps
  35. Read time stamp.sps
  36. Save data file with current date as part of name.sps
  37. Bayes estimates for proportions and their CI.sps with thanks to Evgeny Ivashkevich (this also calculates Confidence Intervals for a category not present in the sample)
  38. Calculate Chi-square significance given q and df.sps 
  39. Calculate 95 percent confidence interval for the median.sps (thanks to Marta Garcia-Granero)
  40. Calculate McNemar Chi-Square test.sps (thanks to Marta)
  41. Hodges Lehmann Confidence Interval for Median difference.sps (thanks to Marta Garcia-Granero)
  42. Exact confidence limits for a binomial parameter.sps 
  43. Goodness of fit test for Poisson Distribution.sps (thanks to Marta Garcia-Granero)
  44. Normalization of raw scores (with thanks to Valentim R. Alferes)
  45. Proportion tests and confidence intervals.sps (thanks to Gwilym Pryce) This includes large-sample
    - significance test for a single population proportion
    - confidence interval for a single population proportion
    - significance test for two population proportions
    - confidence intervals for comparing two population proportions
  46. Testing linear constraints in MR.sps with thanks to Johannes Naumann. This macro tests general linear hypotheses of the type cb=d, where b is a vector of regression coefficients and c is a matrix of linear constraints.
  47. Univariate and multivariate tests of skew and kurtosis (a link to Lawrence T. DeCarlo's SPSS macro). The same site also contains
    - SPSS macro for Mardia's multivariate skew
    - SPSS programs for signal detection models expressed as generalized linear models 

    Fitting distributions
  48. Fitting models with overdispersion or 'extra-Poisson' variation.sps

Export_Import (see also FAQ and Sample Scripts)

  1. Export all tables in word.sps (see Sample Scripts for an automated solution)
  2. Export data and value labels to excel.sps 
  3. Export content of data editor to a specified sheet of an existing Excel workbook.sps
  4. Export from SPSS to ACCESS.sps
  5. Export from SPSS to ACCESS (method2).sps
  6. Export more than 256 vars to Excel.sps
  7. Export some SPSS vars to many sheets of Excel workbook.sps
  8. Import from ACCESS or LotusNotes.sps (no DSN needed: this is very handy. Thanks to Tom Dierickx)
  9. Writing back an SPSS 10 file to an ODBC database.sps (from AnswerNet)

Factor Analysis

  1. Determining the number of components using parallel analysis and Velicer's MAP test (a link to Brian P. O'Connor's site)
  2. Factor analysis with Spearman correlation through a matrix.sps

Flag or Select Cases

  1. Exclude "outliers" from analysis.sps (where outliers are defined as cases outside Mean +/- 2 SD)
  2. Flag cases where a given string variable contains a given word.sps  
  3. Flag cases where any of a list of variables have same value.sps
  4. Flag cases where salary is in top 95 percentile.sps  
  5. Flag cases meeting a certain condition as well as preceding and following case for the same person.sps
  6. Flag first and last dates (within each ID).sps
  7. Keep only duplicate cases.sps
  8. Print frequency table of the n most (less) frequent items.sps
  9. Select patients where drug1 was given before drug2.sps
  10. Select cases where same letter appears twice in string.sps
  11. Sophisticated search in string variable.sps (data were scanned, portion of strings include letters (eg B) instead of numbers (eg 8); this syntax flags the errors)
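The first entry above defines outliers as cases outside Mean +/- 2 SD. A minimal sketch of that rule follows; it is one possible approach and not necessarily the one used in the listed file. The variable names (x, const, xmean, xsd, inrange) and the data are assumptions, and MODE=ADDVARIABLES requires a version of SPSS whose AGGREGATE command supports it.

```spss
* Dummy data with one extreme value.
DATA LIST FREE / x.
BEGIN DATA
10 12 11 13 12 11 10 12 13 11 95
END DATA.

* Add the overall mean and SD to every case via a constant break variable.
COMPUTE const=1.
AGGREGATE
  /OUTFILE=* MODE=ADDVARIABLES
  /BREAK=const
  /xmean=MEAN(x)
  /xsd=SD(x).

* Flag the cases that lie within 2 SD of the mean and analyse only those.
COMPUTE inrange=(ABS(x-xmean) LE 2*xsd).
FILTER BY inrange.
DESCRIPTIVES VARIABLES=x.
FILTER OFF.
```

Using FILTER rather than SELECT IF leaves the excluded cases in the file, so the exclusion can be reviewed or undone.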

IGRAPH (see also the corresponding script section)

  1. Clustered bars with percent based on total in cluster.sps
  2. Example of surface plot.sps
  3. Graphing an arbitrary function.sps
  4. Graph showing interaction in multiple regression.sps
  5. How to speed up IGRAPH.sps (A similar approach could be used for other type of graphs)
  6. Population pyramids.sps
  7. Produce long IGRAPHs.sps
  8. Separate box plot graph for each category value.sps (syntax can be adapted to any other type of graph)

Item Analysis

  1. Syntax for item analysis.sps This is based on SPSS's White Paper on Item Analysis and on this Exercise data file.
  2. Syntax For Item Analysis V6.sps This is a much improved version of the above. It is fully automated and has been developed and tested using SPSS 15. 

Labels, Variable Names and Format

  1. Add (or replace) a character at the beginning of each var names.sps
  2. Add '_99' at the end of every variable names.sps
  3. Apply lab1 as value label to var1 by syntax.sps
  4. Assign same label to many variables.sps
  5. Assign value labels to a vector.sps
  6. Assign variable and value labels of a given variable to other variables.sps
  7. Automatically rename variables.sps
  8. Auto variable renaming or copying.sps  
  9. Change case of Var Labels and/or Value Labels.sps with thanks to Simon Freidin
  10. Change format of 600 variables.sps
  11. Convert variable format.sps (see also the following tutorial. If you are not familiar with macros, see this macro tutorial for newbies).
  12. Create dummy variables.sps (also called indicator or binary variables)
  13. Create dummy variables (AnswerNet).sps
  14. Create new variable equal to number of occurrences of var1.sps
  15. Define a global variable.sps (this is a useful programming technique)
  16. Define variable label by Macro.sps
  17. Delete all variable labels of a given sav file.sps
  18. DeleteListOfVariableNames But Some May Not Exist.sps
  19. Delete variables with all values equal to zero.sps
  20. Delete or reorder variable names (data fields).sps
  21. Delete many variable labels.sps
  22. Group data and define corresponding value labels.sps
  23. Define list of variables between two variables.sps (a macro gem)
  24. Match label file with data file.sps
  25. Print variable labels and value labels in FREQ Tables.sps
  26. Read ASCII data variable name, value and value labels.sps
  27. Recode variables var1 becomes varx etc.sps
  28. Remove underscores from all variable names.sps (can be adapted to remove any other character)
  29. Rename variables.sps
  30. Rename all variables t2abc becomes t1abc etc.sps
  31. Rename var in file1 to names in file2.sps
  32. Reverse scale and value labels.sps
  33. Round and change format of all numeric variables.sps
  34. Show 0.45 instead of .45.sps
  35. Sort variable names by alphabetical order (AnswerNet).sps
  36. Sort variables by name in data file.sps (sent by A. Paul Beaulne)
  37. SortVariablesByAlphabeticalOrder.sps
  38. Write value labels to ASCII file (AnswerNet).sps
  39. Xpand vector names.sps

Matching data files

  1. Compare 2 data files.sps with thanks to Simon Freidin
  2. Create data file if double entries are equals.sps (where entries done by 2 different persons in 2 different files)
  3. Double entry check.sps
  4. Find errors in 2 files (data entered twice).sps
  5. Match one to many where key has 4 variables.sps
  6. Match 2 files using between-dates criteria.sps
  7. Merge 2 data files based on many to many relationship.sps


  1. Coder Reliability with Nominal Data.sps for version 14+ of SPSS with thanks to Pieter van Groenestijn. Here are related notes and references. Here is the syntax for version 13 and related notes.
  2. Cohen's Kappa.sps with thanks to Brian G. Dates. This syntax provides complete information on kappa for any number of raters and categories.
  3. Example that reads, writes, creates and transforms matrices.sps
  4. Export variance covariance matrix to ASCII file.sps
  5. Export variance covariance matrix to sav file.sps
  6. Find inverse of a matrix.sps
  7. Fuzzy Crosstable using Matrix command.pdf with thanks to Ruben P. Konig
  8. Hierarchical sort in MATRIX.sps with thanks to Kirill Orlov
  9. Macro autogenerate initial data file.sps with thanks to Fernando Cartwright
  10. Matrix out in.sps
  11. Maximizing the trace of a matrix.sps This is high-powered stuff: all permutations of the rows must be tested in order to find the one which maximizes the trace. For a 7*7 matrix (the maximum size this macro will handle), there are 7! = 5,040 permutations to test.
  12. Read matrix data.sps
  13. Reliability analysis when input is a correlation matrix.sps
  14. Transform a matrix into a vector.sps

Meta Analysis (See also meta-analysis stuff by David B. Wilson)

  1. META-SPSS.ZIP An exhaustive set of syntax files written by Marta Garcia-Granero as well as sample data files and supporting documents. This is the Read Me First documentation.
  2. Meta Analysis: fixed and random effects models.sps (With thanks to Valentim R. Alferes) This SPSS syntax does a meta-analysis on a set of studies comparing two independent means. It produces results for both fixed and random effects models, using Cohen's d statistics. The user has a total of 10 modes for entering summary data.

Multiple responses

  1. Count unique occurences of a multiple response.sps
  2. Create dichotomous variables from multiple responses which are not in order.sps
  3. Multiple responses are encoded as comma separated letters.sps


  1. Crosstab Chi-square and Phi in same table.sps
  2. OMS and macros.sps 


Caution: Before replacing or deleting outliers, see the warning at the beginning of syntax # 3.
  1. Exclude cases over mean plus 2 times sd.sps
  2. Replace outliers by average of cases with same characteristics.sps
  3. Replace outliers by mean plus/minus n times sd.sps
  4. Winsorize a mean.sps

Parse or Flag data (see also String Manipulation Tutorial)

  1. Extract bits from an integer.sps 
  2. Extract portion of string.sps (string contains first and last name, want first 3 letters of last name)
  3. Extract portion of string starting with a digit.sps
  4. Extract Zip code from address field.sps
  5. Extract two numbers from a string.sps (e.g. string "120/90" becomes numbers 120 and 90)
  6. Flag if last characters of string are 'Esq'.sps
  7. Parse a string into one letter per variable.sps 
  8. Parse comma separated numbers.sps
  9. Parse data separated by slashes.sps
  10. Parse domain name from email addresses.sps  
  11. Parse comma separated strings then autorecode results.sps
  12. Parsing a variable which has embedded line feeds.sps (thanks to Bjarte Aagnes)
  13. Remove letter at end of string and convert remaining string to a number.sps
  14. Split a string variable into plaintiff and defendant portions.sps
  15. String variable contains items separated by a slash.sps (there is a variable number of items from one case to the next)
  16. Weed out letters in a string and create a number with remaining digits.sps

Random Sampling

  1. Complex sampling without replacement.sps
  2. Draw without replacement (random permutation of numbers).sps
  3. Generate random phone numbers.sps (the syntax uses the following data file)
  4. Find random pairs of cases for T-test.sps
  5. Find random pairs of cases with same characteristics.sps
  6. Flag n random cases within each subgroups.sps
  7. Get 2 independent samples meeting given criteria.sps
  8. Get 2 random samples same sex age education.sps
  9. Get n independent random samples of size m from same file.sps
  10. Get random sample of x% of each stratum.sps
  11. Get random sample of N cases from each stratum.sps
  12. Getting repeated sampling from same file.sps
  13. List of random cases id 10 per line.sps
  14. Match cases on basis of propensity scores.sps (this involves matching cases which do not match but are close to each other)
  15. Proportional sampling without replace.sps
  16. Proportional random sampling.sps
  17. Proportional sampling without replacement.sps
  18. Random sample n males and n females.sps
  19. Random samples with same age sex education.sps
  20. Random split a file in two files.sps
  21. Randomize a variable n times and keep each randomization.sps
  22. Scramble social insurance numbers.sps
  23. Select 2 cases from each group.sps
  24. Select random samples of each group.sps
  25. Split files in 2 random portions.sps
  26. Split a file into 10 random groups of equal size.sps
  27. Systematic fixed sampling.sps

Ranking, largest values, sorting, grouping

  1. Aggregating with the median.sps
  2. Calculate cumulative sum of Var1.sps
  3. Calculate mode.sps
  4. Calculate number of distinct values within Case.sps
  5. Calculate z scores across variables.sps
  6. Code using percentiles of a subset.sps
  7. Compute percentiles for one variable and by one or more grouping variables.sps (with thanks to Tom Dierickx). Note that the percentiles end up in the data editor, not in the Output window.
  8. Compute percentages based on values of first case.sps
  9. Create n tiles based on percent ranges rather than on count.sps
  10. For each case, find the earliest case in the preceding 7 days.sps (relatively complex stuff)
  11. Find 5 largest values within case.sps
  12. Find last 2 scores on repeated measure.sps
  13. Find multi-modal values per id.sps 
  14. Identify the highest 3 scores of each case.sps
  15. Identify variables having minimum value.sps (with thanks to Maciek Lobinski)
  16. New variable equals cumulative totals by id.sps
  17. Number consecutively cases with the same id.sps
  18. Random order.sps
  19. Rank equal intervals between minimum and maximum.sps
  20. Rank on basis of percentage of good.sps
  21. Rank variable names in alpha order.sps
  22. Rank within cases.sps
  23. Replace missing by median values within the case.sps
  24. Round up to the higher point 5.sps
  25. Saving confidence interval for mean (within groups).sps
  26. Score a test with an answer key.sps (thanks to A. Paul Beaulne)
  27. Sorting values within cases (using the bubble sort algorithm).sps
  28. Syntax group data in bands.sps
  29. Various Algorithms to sort within cases.sps With thanks to Kirill Orlov

Read, Write or Create Data


  1. Adding new cases using syntax.sps
  2. Add variable equal to function of an existing Var.sps
  3. A few simple examples of INPUT PROGRAM.sps (a short tutorial)
  4. Copy some variables from each record type 1 to add a new record of type 0.sps
  5. Create consecutive records at the end of the file.sps
  6. Create constants for each non missing date.sps
  7. Define new variables in empty data set.sps
  8. Define varx to vary.sps
  9. Duplicate cases n times where n is variable.sps (see also Expand Crosstab Data below)
  10. Expand crosstab data into original data file.sps (disaggregate data)
  11. Expand data x and y times.sps (e.g. from a case where age=20, males=5 and females=6, create 5 cases with age=20 and sex=1 and 6 cases with age=20 and sex=0)
  12. Fill the gaps when Aggregate has empty categories.sps (the syntax creates cases to fill the gaps)
  13. Generate random dates.sps
  14. INPUT program (to generate a random data file).sps
  15. Insert missing cases (within id).sps
  16. Insert missing dates (within id).sps
  17. Printing date time in output.sps
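The "Expand data x and y times" entry above turns one row of counts into one case per person. A minimal sketch of the idea, using XSAVE inside a LOOP, is shown below. This is an illustration under stated assumptions rather than the code in the listed file: the file name expanded.sav and the second data row are hypothetical.

```spss
* Dummy data: one row per age with counts of males and females.
DATA LIST FREE / age males females.
BEGIN DATA
20 5 6
21 2 3
END DATA.

* Write one case per person: the first cases (as many as males) get sex=1.
* The remaining cases (as many as females) get sex=0.
LOOP #i=1 TO males+females.
COMPUTE sex=(#i LE males).
XSAVE OUTFILE='expanded.sav' /KEEP=age sex.
END LOOP.
EXECUTE.

* Load and list the expanded file.
GET FILE='expanded.sav'.
LIST.
```

XSAVE is a transformation, so nothing is written until the EXECUTE (or another procedure) forces a data pass.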


  1. Example of data list.sps
  2. Example of INPUT program.sps
  3. Read a variable number of records per case.sps
  4. Read ASCII (logical case is made up of 5 rows of 10 cases).sps
  5. Read ASCII file using FILE TYPE.sps
  6. Read ASCII file using INPUT PROGRAM.sps
  7. Read ASCII file with a forward slash delimiter.sps
  8. Read ASCII file with comma or dash delimited data .sps
  9. Read ASCII with comma and dot separated decimals.sps
  10. Read ASCII file with comma separated data (within quotes).sps
  11. Read ASCII file with fixed and free data.sps
  12. Read ASCII file with FIXED Data.sps
  13. Read ASCII file with REPEATING data.sps
  14. Read comma delimited fields with commas inside quoted strings.sps
  15. Read comments between the lines of data.sps
  16. Read complex file.sps
  17. Read data files that has no carriage returns.sps (from AnswerNet) (data is just one long stream, with no separation between records or fields, and no carriage returns)
  18. Read data list free with consecutive commas.sps
  19. Read data produced by CGI script.sps
  20. Read data where each case has 4 numeric records and a variable number of string records.sps (this illustrates the use of the REREAD command)
  21. Read data inline File Type MIXED Records.sps
  22. Read text file where n columns are to be ignored.sps (n is a variable which varies by file)
  23. Skip first 6 Records.sps
  24. Skip one line of data.sps


  1. Write comma or tab delimited file.sps
  2. Write frequency percentages to data file.sps
  3. Write missing values as a dot.sps
  4. Write special ASCII file.sps
  5. Writing value labels instead of values.sps

Regression, Repeated Measures

  1. Add casewise regression coefficients to data file.sps
  2. Breusch-Pagan & Koenker test.sps (thanks to Marta Garcia-Granero)
  3. Calculate predicted values (unianova).sps
  4. Chow test.sps
  5. Compare regression coefficients.sps (thanks to A. Paul Beaulne for sending me this code)
  6. Compare coefficients generated by various groups.sps
  7. Conditional logistic regression.sps
  8. Do all univariate linear and logistic regressions.sps (thanks to Marta Garcia-Granero)
  9. Do All-Subsets regressions.sps  
  10. Generalized Estimation Equations (GEE).zip (thanks to Terry Duncan, PhD, Oregon Research Institute). GEE is a macro for analyzing longitudinal data. The SPSS macro uses the GEE approach of Liang and Zeger (1986) to model longitudinal data for a general class of outcome variables including Gaussian, Poisson, binary and gamma outcomes.
  11. Logistic regression by macro.sps
  12. Multinominal Logistic Regression with split-sample validation.sps  (thanks to Maciek Lobinski for sending me this macro)
  13. Non linear regression (NLR) with variance of residuals as the loss function.sps (this is not trivial)
  14. Piecewise regression.sps (also known as "spline regression" and "piecewise polynomials")
  15. Regression calculates table of predicted values.sps
  16. Regression in a loop.sps
  17. Regression when holding out k cases.sps
  18. Regression with correlation matrix as input.sps
  19. Regression with normed weight.sps
  20. Repeated measures macro.sps
  21. Ridge regression.sps (this comes with SPSS)
  22. Testing individual regressors in logistic regression.sps
  23. White's standard errors full OLS and White's SE output.sps (thanks to Gwilym Pryce)
    See also the following related tutorials on Heteroscedasticity.
  24. White's test: calculate the statistics and its significance.sps (thanks to Marta Garcia-Granero)

Remove Characters, Duplicates or Variables

  1. Delete cases with offset cases.sps
  2. Delete double entries.sps (thanks to Maciek Lobinski) For instance, if var1 equals var2 for a given case, the syntax replaces var2 with the system-missing value.
  3. Find duplicates.sps
  4. Remove double quotes.sps
  5. Remove duplicate records.sps
  6. Remove unused variables from many files.sps
  7. Replace consecutive spaces in string by a single space.sps
  8. Replace character In string.sps (see also String Manipulation Tutorial)
  9. Save duplicates in a separate file.sps 

Restructure File

  1. Allocate dummy variables to 24 hours.sps
  2. Automated data transform from tall to wide.sps 
  3. Automated restructure v4.sps (thanks to Kevin Hynes) This example maintains a grouping factor while restructuring data from tall to wide.
  4. Automated restructure from long to wide.sps with thanks to Hillel Vardi. This is the sample data file used.
  5. Collapse empty variables within a case.sps
  6. Deduplicate cases while keeping all the information.sps (a cute little problem)
  7. Each variable occupies 5 rows of 10 columns.sps (another nice little problem)
  8. Find beginning and end of continuous periods.sps
  9. From many to one example1.sps
  10. From many to one example2.sps
  11. From many to one with alpha data.sps
  12. From many to one with specific order of new variables.sps
  13. From one to Many simple.sps
  14. From one to many with indicator variable.sps
  15. Restructure data file example1.sps
  16. Restructure data file example2.sps
  17. Restructure data file example3.sps
  18. Restructure data file example4.sps
  19. Restructure from tall to wide (general solution).sps (non trivial macro code...)
  20. Restructure time periods to a time matrix.sps
  21. Restructure to calculate Kappa.sps
  22. Transpose (FLIP) string variables.sps
  23. Use former variable names as value labels.sps

    Following examples require version 12 or above
  24. VarsToCases and CasesToVars.sps

ROC Curves

  1. ROC curves & Youden's Index.sps The syntax also computes Likelihood Ratios and Kullback-Leibler distances (requires v12 or above).

Sample Size and Power

  1. Power analysis examples.sps  With thanks to Bruce Weaver
  2. Sample size for means.sps With thanks to Marta Garcia-Granero. This is a collection of several short macros that perform sample size calculations for confidence interval estimation and one sample / two samples tests for means (this last one with equal or unequal sample sizes).
  3. Sample size for proportions.sps With thanks to Marta Garcia-Granero. A collection of macros that perform sample size calculation for the estimation of one proportion and one or two samples hypothesis testing, as well as the calculation of the power of a test.
  4. Sample size for correlation hypothesis testing.sps thanks to Marta!

Self Adjusting Syntax (other examples are scattered throughout this site)

  1. Automated data transform from tall to wide.sps
  2. Choice of include file depends on data.sps
  3. End of macro DO LOOP comes from data.sps
  4. Execute selective portions of syntax.sps see also tips on INCLUDE command
  5. From 2 files to 1 cases per id.sps
  6. Syntax varies based on name of data file.sps

String Manipulation


  1. Are all words present.sps This tests whether all words passed to the macro are present within a given string variable.
  2. Are all words present dichotomy vars.sps Similar to the above but creates one dichotomous variable for each target word.
  3. Change all strings in data file to lower case.sps
  4. Convert numbers to strings.sps
  5. Convert string '250 million' into a number.sps (or '16 billion' etc)
  6. Convert string to numeric variable.sps
  7. Soundex Phonetic Comparison.sps

Survival Analysis

  1. Show 95pc CI for failure points on a survival plot.sps
  2. Survival Analysis Example.sps

Tests of Inequality

  1. Index of dissimilarity.sps (formulas from Negroes in Cities (1965) by Karl and Alma Taeuber)
  2. Many tests of inequality v5.sps   (this chart template is used by the syntax)

The above syntax (formulas come from ) calculates the following indexes:

   the ATKINSON index = DEMAND coefficient.
   the THEIL redundancy.
   the RESERVE coefficient.
   the D&R coefficient.
   the KULLBACK-LEIBLER redundancy.
   the HOOVER coefficient.
   the COULTER coefficient.
   the GINI coefficient.

The Lorenz curve is produced. The various indexes are plotted on the same graph when there are data for more than one year. The file ends with 9 examples of how to use the syntax.
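
To make two of the listed indexes concrete, here is a minimal sketch in Python (not SPSS syntax, and not taken from the file above). It assumes ungrouped, non-negative individual values and uses the common textbook definitions of the Gini coefficient, the Hoover coefficient and the Lorenz curve; the listed syntax may work from grouped data or use different variants. The function names are illustrative.

```python
def gini(values):
    """Gini coefficient (population form, no small-sample correction)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total <= 0:
        raise ValueError("need at least one value and a positive total")
    # Rank-weighted form: G = 2*sum(i*x_i)/(n*sum(x)) - (n+1)/n, ranks i = 1..n
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


def hoover(values):
    """Hoover coefficient: share of the total that would have to be
    redistributed to reach perfect equality."""
    n = len(values)
    total = sum(values)
    if n == 0 or total <= 0:
        raise ValueError("need at least one value and a positive total")
    mean = total / n
    return sum(abs(x - mean) for x in values) / (2.0 * total)


def lorenz_points(values):
    """Points (cumulative share of cases, cumulative share of total)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    points = [(0.0, 0.0)]
    running = 0.0
    for i, x in enumerate(xs, start=1):
        running += x
        points.append((i / n, running / total))
    return points
```

With four cases where one holds everything, `gini([0, 0, 0, 10])` and `hoover([0, 0, 0, 10])` both give 0.75; with equal values both give 0.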

Test if file or variable exists

  1. Check for existence of file.sps  
  2. Choice of include file depends on existence of a given variable.sps
  3. Get all string or all numeric variables.sps   (the 2 macros produced by this macro allow you to process all string or all numeric variables in the data file)

Time Series

  1. Gaussian Filter.sps 

Transform variable

  1. Automatically rescale variable to be between 0 and 1.sps
  2. Calculate utility of EuroQol 5D questionnaire.sps  with thanks to AJ Garcia Ruiz
  3. Constrain a variable to a given interval.sps (syntax is first given, then it is generalized using 2 macros)
  4. Convert numbers to string with leading zeros.sps
  5. Create variable equal to z-scores of an existing variable.sps
  6. Extract fist or first 2 digits of a large integer.sps 
  7. Global autorecode.sps A nice problem: autorecode many string variables where the recode formula (e.g. a=1, b=2, etc.) is the same for all variables even though none of the variables has all possible values.
  8. Replace confidential information eg a ssn by a new (known) id.sps 
  9. Replace values higher than n by the mean of the other values.sps  
  10. Replace letter by 9999 then convert to number.sps
  11. Transform string coding into numbers.sps (5A becomes 5.1; 7B becomes 7.2; 9D becomes 9.4  etc)
  12. zip code With thanks to Christopher Boyd. This zip file contains 2 syntax files: one recodes zip into town; the other does the reverse. Zip codes are those used by the US Census to generate zip code level employment data.
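
Three of the transformations named in this list (rescaling to the 0 to 1 range, z-scores, and constraining a value to an interval) are simple enough to sketch. The following Python is illustrative only and is not the code in the listed files. It assumes a plain list of numbers with no missing values, and it computes z-scores with the sample standard deviation (n - 1 denominator).

```python
import math


def rescale_01(values):
    """Linearly map values so the minimum becomes 0 and the maximum 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are equal; the range is zero")
    return [(v - lo) / (hi - lo) for v in values]


def z_scores(values):
    """Standardize values using the sample standard deviation (n - 1)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    if sd == 0:
        raise ValueError("standard deviation is zero")
    return [(v - mean) / sd for v in values]


def constrain(value, low, high):
    """Clamp value to the closed interval [low, high]."""
    return max(low, min(high, value))
```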

T-Test or Means or ANOVA

  1. ANOVA A*B.sps (thanks to Valentim Alferes) This does an A*B Factorial ANOVA and calculates variance components, measures of association, measures of effect size and observed power. Works with raw data or published summary statistics.
  2. ANOVA_Tables using 4 methods.sps (thanks to Valentim Alferes) method 1 for Ns, Means and SDs; method 2 for Ns, Means and Variances; method 3 for Ns, Means and MS Error; method 4 for Means, Df num, Df den and MS Error.
  3. Cochran Hartley Critical Values.sps This gives the tabulated critical values at 5% and 1% for both HOV tests. Thanks to Marta Garcia-Granero
  4. Compare mean of each hospital with mean of all other hospitals.sps (nice little macro)
  5. Do a T-Test with only the Means, SD and Ns.sps (uses ANOVA)
  6. Do T-Test with only means,SD and Ns.sps (thanks to Marta Garcia-Granero) This includes Hartley's F test, the standard T-test and the Welch test; asymptotic and non-asymptotic 95% CIs are calculated.
  7. Hotelling's T**2 & Profile Analysis.sps (thanks to Richard MacLennan)
  8. Multiple Mann-Whitney tests.sps (using a macro to have a procedure inside a LOOP)
  9. ONEWAY with summary data1.sps Performs a ONEWAY ANOVA plus several Homogeneity of Variances tests on summary data. Thanks to Marta Garcia-Granero
  10. ONEWAY with summary dataI2.sps Performs several ONEWAY ANOVAS plus several Homogeneity of variances tests on summary data. Any number of variables can be analysed. Thanks to Marta Garcia-Granero.
  11. Standardized effects size (Cohen Glass and Hedges's d).sps (with thanks to Marta) The effect sizes and their standard errors are added to the data file.
  12. T-Tests and Likert scales.sps  
  13. T-Test effect size non overlap and power.sps (thanks to Valentim Alferes) User can either analyse raw data or reproduce the SPSS T-Test standard output using summary statistics in published articles.


  1. Adjusted p-values algorithms.sps thanks to Marta Garcia-Granero for this improved version of her code. References are included.
    The code calculates adjusted p-values using the following 8 methods:
    One-step: Bonferroni and Sidak.
    Step-down: Bonferroni (Holm's), Sidak (also called Holm's-Sidak) and Finner.
    Step-up: Hommel, Hochberg and Simes.
  2. Calculate average percent score.sps
  3. Calculations on dynamic columns.sps
  4. Canonical correlation.sps (this comes with SPSS)
  5. Fill in the gaps.sps (information in the file has been left blank when it equals the information in the preceding case; this syntax fills the gaps)
  6. Fill in the gaps (within ID).sps
  7. Interaction in factorial designs when dependent variable is not normal.sps Thanks to Marta Garcia-Granero for this code.
  8. Stop or resume generating outputs in the output window.sps
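
Four of the adjustments named under item 1 above can be sketched in a few lines. This Python is an illustration of the standard definitions, not a translation of the listed SPSS code. It assumes a list of raw p-values from one family of tests; the function names are made up for the example.

```python
def bonferroni(pvals):
    """One-step Bonferroni: multiply each p-value by the number of tests."""
    m = len(pvals)
    return [min(1.0, m * p) for p in pvals]


def sidak(pvals):
    """One-step Sidak: 1 - (1 - p)^m."""
    m = len(pvals)
    return [1.0 - (1.0 - p) ** m for p in pvals]


def holm(pvals):
    """Step-down Bonferroni (Holm): smallest p first, running maximum."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):  # rank 0 = smallest p-value
        candidate = min(1.0, (m - rank) * pvals[idx])
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


def hochberg(pvals):
    """Step-up Hochberg: largest p first, running minimum."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for rank, idx in enumerate(order):  # rank 0 = largest p-value
        candidate = min(1.0, (rank + 1) * pvals[idx])
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    return adjusted
```

All four functions return the adjusted p-values in the same order as the input, so the results can be written back next to the original tests.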

Working with Many Files (see also the corresponding scripts section)

  1. Combine many data files with same variables.sps
    alternative 1: the following script works even when file names are unknown
    alternative 2: Use the DOS command copy *.sps newfile.sps to concatenate every file matching the pattern into one file; to combine txt files, use the pattern *.txt instead (thanks to Scott Clark for this suggestion)
  2. Combine any number of consecutively named sav files 50 at a time.sps
  3. Combine many xls files into a single sav file.sps
  4. Combine 2 data files many to many.sps
  5. Data list is outside the main syntax.sps (illustrates how a syntax file can be modified by syntax)
  6. Delete cases contained in file2 from the main data file.sps
  7. Erase files.sps
  8. Example 1 using UPDATE command.sps
  9. Example 2 using UPDATE command.sps
  10. Get mean from 3 different files.sps
  11. Include 200 syntax files by macro.sps
  12. Keep only cases from Master file whose id are in second file.sps
  13. Macro to delete a list of files.sps 
  14. Many folders and many files.sps
  15. Process all xls files in (this script works jointly with this syntax file)
  16. Run a macro on several files.sps
  17. Run syntax on files whose names are derived from a data file.sps
  18. Run a macro on every file whose name is in a sav file.sps
  19. Save file1 file2 file3 etc by macro.sps
  20. Split big files into separate categories.sps (create a different sav file for each value of a numeric categorical variable)
  21. Split big files into separate categories string var.sps (create a different sav file for each value of a string categorical variable)
  22. Split file with kn cases into k files of n cases each.sps
  23. Unusual file merge.sps
  24. Show number of differences, if any, between 2 files.sps (to check double entry of data). For additional examples, see Matching data files

Working With Missing Values

Caveat: Replacing missing values is not something to be done lightly. David C Howell has a good page on the Treatment of Missing Data.

  1. Conditionally replacing missing by mean.sps
  2. Conditionally replacing missing by mean example 2.sps
  3. Delete variables that have only missing values.sps
  4. Identifying the 3 types of missing values.sps
  5. Hot Deck.sps substitution of missing values of X within STRATUM (thanks to Theo van der Weegen)
  6. List variable names with missing values and identify main elements of cases.sps
  7. Listing Variables with missing values (Per case).sps with thanks to David Marso
  8. Mean substitution in additive scale.sps
  9. Missing values and DO IF.sps
  10. Recode certain dates as missing.sps
  11. Replace "Blanks" by value from preceding case.sps
  12. Replace missing by mean of category.sps
  13. Replace missing by median values within each case.sps
  14. Replace missing by random value taken from cases with valid value.sps (see Hot Deck.sps above  for a more general solution)
  15. Replace missing with mean.sps
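
As an illustration of the "replace missing by mean of category" idea in the list above (and with the caveat at the top of this section in mind), here is a minimal Python sketch. It is not the listed SPSS code. It assumes rows are dicts, that a missing value is represented by `None`, and that the keys `group` and `x` are placeholders.

```python
from collections import defaultdict


def fill_missing_with_group_mean(rows, group_key, value_key):
    """Return copies of rows where a missing value (None) is replaced by
    the mean of the valid values in the same group."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    # First pass: accumulate the valid values per group.
    for row in rows:
        value = row[value_key]
        if value is not None:
            sums[row[group_key]] += value
            counts[row[group_key]] += 1
    # Second pass: fill the gaps, leaving the input rows unchanged.
    filled = []
    for row in rows:
        new_row = dict(row)
        group = row[group_key]
        if new_row[value_key] is None and counts[group] > 0:
            new_row[value_key] = sums[group] / counts[group]
        filled.append(new_row)
    return filled


data = [
    {"group": "a", "x": 1},
    {"group": "a", "x": None},
    {"group": "a", "x": 3},
    {"group": "b", "x": None},
]
result = fill_missing_with_group_mean(data, "group", "x")
```

A group with no valid values at all (group "b" here) keeps its missing value, since there is no mean to substitute.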