Kirill's SPSS Macros Page
Here you can find some original SPSS macros: small programs written in the SPSS Statistics command language, useful for those who process data with this statistical package.
Since June 2005
Updated Aug. 11, 2017
Some instructions and information
Kirill’s SPSS macros page occupies a separate corner of spsstools.net, the greatest SPSS programming resource, which exists thanks to Raynald Levesque (creator) and Anton Balabanov (director). Although it is part of the site, the page is “stand-alone” and is run by its own creator, Kirill Orlov.
Please do not publish the macros or their description documents without the author's consent, but do use them freely. Please credit the author when you share them or report their use.
Not everything in the descriptions that you download with the macros is in English. Macro calls and most other essential parts have been annotated in English (for the most part, this is enough to use a macro). The rest of the description text, which may contain nuances, is still in Russian. I hope to translate it later.
Minor revisions of the macros may be made without an update note on the page or in the descriptions. Please don't hesitate to report any bugs you find or to suggest ideas or proposals. I’ll be glad to receive your feedback.
Here are the collections of macros
Recoding categorical variables into binary or vice versa. Collection of macros for converting categorical data into binary data or back; for example, converting a categorical multiple response set (MRC) into a dichotomous multiple response set (MRD) or the other way round. Such a need arises frequently when processing survey data.
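To illustrate what such a conversion involves, here is a minimal Python sketch (not the macros' own SPSS code). The function names `mrc_to_mrd` and `mrd_to_mrc` are hypothetical, and missing responses are assumed to be stored as `None`.

```python
def mrc_to_mrd(rows, categories):
    """Categorical set (one code per variable) -> 0/1 indicator per category."""
    result = []
    for row in rows:
        chosen = {v for v in row if v is not None}  # codes this respondent gave
        result.append([1 if c in chosen else 0 for c in categories])
    return result


def mrd_to_mrc(rows, categories):
    """0/1 indicators -> categorical set, padded with None to a common width."""
    coded = [[c for c, flag in zip(categories, row) if flag == 1] for row in rows]
    width = max((len(codes) for codes in coded), default=0)
    return [codes + [None] * (width - len(codes)) for codes in coded]


mrc = [[2, 3, None], [1, None, None]]   # two respondents, up to three answers
mrd = mrc_to_mrd(mrc, [1, 2, 3])
print(mrd)                              # one 0/1 column per response category
print(mrd_to_mrc(mrd, [1, 2, 3]))
```

Note that the round trip keeps the set of chosen codes but not necessarily the original order or number of MRC variables.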
Tools for multiple response sets. One macro is intended to fix a categorical multiple response set (MRC). Another macro supplies dichotomous multiple response sets (MRD) with “no answer” variables. One more macro enriches or reduces the data of a categorical multiple response set by consulting other variables that have the same response list. A pair of other macros create a multiple response set out of a string variable (it can be handy to enter the responses to a multiple choice question into a single string variable first).
Tools for series of items. Collection of macros for a “simple matrix question”, i.e. a series of variables with a common pool of alternative responses (single response series, SRS); for example, a set of items each scored on a rating scale or ranked. One of the macros is for data in which respondents ranked the items; it converts the variables into a categorical multiple response set or back. Another macro is intended for the more general task of translating values and variables into each other, as well as for calculations on duplicated values. The third macro is for the situation when respondents rated not all items but only those they had chosen before, and the rating data were entered in a packed (quick-entry) mode.
Some horizontal operations. Collection of macros performing some commonly needed operations (such as sorting, ranking or counting unique values) within cases, horizontally. The input file remains intact because no transposing is applied.
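What “within cases, horizontally” means can be sketched in a few lines of Python. This is only an illustration with hypothetical function names; it assumes a case is a list of numeric values with no missing data, and it uses average ranks for ties, which may differ from the macros' own options.

```python
def sort_within_case(row):
    """Values of one case in ascending order."""
    return sorted(row)


def count_unique_within_case(row):
    """Number of distinct values in one case."""
    return len(set(row))


def rank_within_case(row):
    """Ranks of the values inside one case; tied values share the mean rank."""
    ordered = sorted(row)
    ranks = []
    for v in row:
        first = ordered.index(v) + 1      # lowest rank position of this value
        count = ordered.count(v)          # how many times it occurs
        ranks.append(first + (count - 1) / 2)
    return ranks


case = [3, 1, 3]
print(sort_within_case(case), count_unique_within_case(case), rank_within_case(case))
```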
Derandomizing of tasks. If the same tasks (some stimuli, e.g. questionnaire questions, specimens being tested, or medical treatments) were offered to different respondents in a different sequence, and the data were then entered in that order of exposure (“order of trials”), the macro will restructure the data into a unified “order of tasks” in which each variable contains the data of only one task.
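The restructuring can be pictured with a small Python sketch. It assumes, purely for illustration, that each respondent has a list of responses in order of trials plus a list saying which task (numbered from 1) was shown at each trial; the name `derandomize` is hypothetical.

```python
def derandomize(responses, task_order, n_tasks):
    """Reorder one respondent's responses from order of trials to order of tasks.

    responses[i] is the answer given at trial i;
    task_order[i] is the number (1..n_tasks) of the task shown at trial i.
    Tasks that were never shown stay None.
    """
    by_task = [None] * n_tasks
    for response, task in zip(responses, task_order):
        by_task[task - 1] = response
    return by_task


# Trial 1 showed task 2, trial 2 showed task 3, trial 3 showed task 1.
print(derandomize([5, 3, 4], [2, 3, 1], 3))
```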
Weighting groups. Achieving the desired proportions of respondent groups by univariate or multivariate (rim) weighting. You can set the total N, impose restrictions on the weights of individual cells or cases, weight several subsamples in parallel, and take initial weights into account.
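Rim weighting is commonly done by iterative proportional fitting (raking): the weights are rescaled to match the target shares of one variable, then of the next, and so on until they stop changing. The Python sketch below shows that general idea only; it is an assumption that the macro works along these lines, the name `rake` and its parameters are hypothetical, and restrictions on weights and initial weights are not covered.

```python
def rake(cases, targets, n_iter=200, tol=1e-9):
    """Rim weighting by iterative proportional fitting.

    cases:   list of dicts, variable name -> category of that case
    targets: dict, variable name -> {category: target proportion}
    Returns one weight per case; the weights sum to len(cases).
    """
    weights = [1.0] * len(cases)
    total = float(len(cases))
    for _ in range(n_iter):
        max_change = 0.0
        for var, shares in targets.items():
            # Current weighted size of every category of this variable.
            sums = {}
            for case, w in zip(cases, weights):
                sums[case[var]] = sums.get(case[var], 0.0) + w
            # Rescale each case so the category hits its target size.
            for i, case in enumerate(cases):
                factor = shares[case[var]] * total / sums[case[var]]
                max_change = max(max_change, abs(factor - 1.0))
                weights[i] *= factor
        if max_change < tol:  # all margins already match their targets
            break
    return weights


cases = ([{"sex": "m", "age": "young"}] * 3 + [{"sex": "m", "age": "old"}] * 1
         + [{"sex": "f", "age": "young"}] * 2 + [{"sex": "f", "age": "old"}] * 4)
targets = {"sex": {"m": 0.5, "f": 0.5}, "age": {"young": 0.4, "old": 0.6}}
print([round(w, 3) for w in rake(cases, targets)])
```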
Categorical into Contrast (to be added)
Categorical variables into contrast variables. Creates contrast variables from categorical variables (three types to choose from) and their interaction variables. Contrast variables are needed above all when one has to analyse the influence of qualitative factors by methods designed for quantitative input (e.g. linear regression).
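As an illustration of what contrast variables look like, the Python sketch below builds two widely used codings, indicator (dummy) and deviation (effect) coding, with the last category as reference. Which three types the macro actually offers is not stated here, so these two are only assumed examples, and the function names are hypothetical.

```python
def indicator_coding(values, categories):
    """k-1 dummy variables; the last category is the reference (all zeros)."""
    return [[1 if v == c else 0 for c in categories[:-1]] for v in values]


def deviation_coding(values, categories):
    """k-1 effect-coded variables; the last category is coded -1 on all of them."""
    rows = []
    for v in values:
        if v == categories[-1]:
            rows.append([-1] * (len(categories) - 1))
        else:
            rows.append([1 if v == c else 0 for c in categories[:-1]])
    return rows


values = ["a", "b", "c"]
print(indicator_coding(values, ["a", "b", "c"]))
print(deviation_coding(values, ["a", "b", "c"]))
```

An interaction variable between two factors is then the product of one contrast variable from each factor.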
Various proximity measures. Calculation of some pairwise measures of proximity or association (similarities, distances, correlations) absent in SPSS. Among them are Gower similarity for comparing respondents by quantitative and qualitative characteristics at once; Canberra distance which is optimal for comparing respondents based on their responses to a ranking question; tetrachoric and biserial coefficients of correlation.
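Two of the measures mentioned above are easy to sketch in Python using their textbook definitions (this is not the macros' code, and the function names and argument layout are hypothetical). Canberra distance sums |x - y| / (|x| + |y|) over variables; Gower similarity averages a per-variable score that is 1 - |x - y| / range for quantitative variables and a simple match (1 or 0) for qualitative ones. Missing values and variable weights are left out.

```python
def canberra(x, y):
    """Canberra distance; terms where both values are zero contribute nothing."""
    total = 0.0
    for a, b in zip(x, y):
        denom = abs(a) + abs(b)
        if denom > 0:
            total += abs(a - b) / denom
    return total


def gower(x, y, kinds, ranges):
    """Gower similarity between two cases.

    kinds[i]  is "num" for a quantitative variable, anything else for qualitative;
    ranges[i] is the sample range of a quantitative variable (ignored otherwise).
    """
    scores = []
    for a, b, kind, rng in zip(x, y, kinds, ranges):
        if kind == "num":
            scores.append(1.0 - abs(a - b) / rng)
        else:
            scores.append(1.0 if a == b else 0.0)
    return sum(scores) / len(scores)


print(canberra([1, 2, 3], [3, 2, 1]))                         # two rankings
print(gower([30, "m"], [40, "f"], ["num", "cat"], [20, None]))
```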
Differences inside or between matrices. The macros compute a matrix of distances between matrices of proximity coefficients (rather than between variables or cases), such as correlation or distance matrices, or between columns inside such matrices. These comparisons can help a researcher, for example, before a cluster or a factor analysis.
Fitting variables to a matrix of coefficients. The macros modify the variables’ values so that the strength of the relations among the variables matches a user-specified matrix (correlation, covariance, or cross-product). An option that guards against heteroscedasticity makes it possible to obtain homoscedastic relationships.
Cumulative curves. Macros related to the analysis of cumulative distributions. One of them compares subsamples, via cluster analysis, by the shape of the cumulative distribution of the variables. Another macro, intended for marketing, analyses data of the so-called price sensitivity meter (PSM).
Clustering criteria. Computation of indices, such as Calinski–Harabasz, Davies–Bouldin, Ratkowsky–Lance, C-Index, correlation, Gamma statistic, Dunn, Silhouette statistic (several types), AIC, BIC, that help in choosing the best partition, specifically, in deciding how many clusters to extract in a cluster analysis.
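As an example of such an index, the Calinski–Harabasz criterion is the ratio of between-cluster to within-cluster sum of squares, each divided by its degrees of freedom: (B / (k - 1)) / (W / (n - k)). A minimal Python sketch of this standard formula follows; the function name is hypothetical, and it assumes at least two clusters and nonzero within-cluster scatter.

```python
def calinski_harabasz(points, labels):
    """Calinski-Harabasz index for points (lists of coordinates) and cluster labels."""
    n = len(points)
    dims = len(points[0])
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    k = len(clusters)
    grand = [sum(p[d] for p in points) / n for d in range(dims)]
    between = 0.0
    within = 0.0
    for members in clusters.values():
        centre = [sum(p[d] for p in members) / len(members) for d in range(dims)]
        # Between: cluster size times squared distance of centre from grand mean.
        between += len(members) * sum((c - g) ** 2 for c, g in zip(centre, grand))
        # Within: squared distances of members from their own centre.
        within += sum((p[d] - centre[d]) ** 2 for p in members for d in range(dims))
    return (between / (k - 1)) / (within / (n - k))


print(calinski_harabasz([[0], [2], [10], [12]], [0, 0, 1, 1]))
```

Larger values indicate more compact, better separated clusters, so the index is usually compared across solutions with different numbers of clusters.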
Euclidean corrections and conversions. Macros for matrices of proximities that must be embedded in Euclidean or metric space. You can convert similarities (of a covariance/correlation type or interpretation) into distances, or vice versa, in a geometrically correct way; you can also correct similarities or dissimilarities that do not fully satisfy the requirements of such a space into ones that do.
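For similarities of the scalar-product kind (covariances, correlations), the usual geometric conversion follows the law of cosines: d(i, j) = sqrt(s(i, i) + s(j, j) - 2 s(i, j)), which for correlations reduces to sqrt(2 (1 - r)). The Python sketch below shows only this conversion, under the assumption that the matrix is symmetric; the function name is hypothetical and the macros' correction steps are not reproduced.

```python
import math


def similarity_to_distance(s):
    """Convert a scalar-product-type similarity matrix to Euclidean distances."""
    n = len(s)
    return [
        # max(..., 0.0) guards against tiny negative values from rounding.
        [math.sqrt(max(s[i][i] + s[j][j] - 2.0 * s[i][j], 0.0)) for j in range(n)]
        for i in range(n)
    ]


corr = [[1.0, 0.5], [0.5, 1.0]]
print(similarity_to_distance(corr))
```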
Instruments facilitating work. Macros that are not connected with a specific analysis or processing task but rather serve to speed up various kinds of work done through syntax. One of them is an alternative to “SPSS Production Facility” that accelerates the production of tables etc.
Regular clouds. Creating multivariate data with a regular, nonrandom structure. In particular, such data can be regarded as entirely unclustered, unlike randomly generated data. Useful as model data for exploring the behaviour of one or another statistical algorithm, for example cluster analysis.
Random cluster/mixture data. Creation of random data consisting of clear clusters or mixtures (fuzzy clusters). You can make these clouds round or elongated, Gaussian or platykurtic, and regulate their sizes and how close they lie to one another. A separate macro randomly rotates data in space.
Neighbourhood chains. From data showing pairwise relationships within a set of objects, the macro extracts the information about which object each given object refers to “in the first place” or “most strongly”. In this way a trajectory of sequential references is built. It is shown in the form of a table (adjacency list) and a dendrogram.
Pairing cases of two samples. An optimal pairing is made between two samples or sets, such that the sum of within-pair differences is minimized. The Hungarian algorithm is used to match elements of the two arrays into pairs.
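The objective being minimized can be shown with a tiny Python sketch. Note that it deliberately does not implement the Hungarian algorithm: it simply tries every permutation, which gives the same optimum but is feasible only for very small samples. It also assumes two samples of equal size measured on one variable; the function name is hypothetical.

```python
from itertools import permutations


def optimal_pairing(a, b):
    """Brute-force optimal pairing of equal-sized samples a and b.

    Returns (perm, cost): a[i] is paired with b[perm[i]], and cost is the
    minimal sum of absolute within-pair differences.
    """
    best_perm, best_cost = None, float("inf")
    for perm in permutations(range(len(b))):
        cost = sum(abs(x - b[j]) for x, j in zip(a, perm))
        if cost < best_cost:
            best_perm, best_cost = perm, cost
    return best_perm, best_cost


print(optimal_pairing([1.0, 5.0, 9.0], [8.5, 1.5, 5.5]))
```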
Procrustes analysis. Procrustes analysis for two configurations finds a way to superpose two clouds of points in space as closely as possible, given that each point in one cloud corresponds by design to a point in the other. The residual amount of mismatch indicates the initial degree of non-identity of the configurations. The analysis is used for comparing shapes and for juxtaposing ordinations (for example, factor loading matrices, to detect identical factors).
Adding latents as lines to data cloud. The macros show the principal components or discriminants of the data on a scatterplot of those data, in the form of lines made up of points, which are the scores on these latents.
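For intuition, here is a deliberately restricted Python sketch of the idea: two-dimensional configurations only, with translation and rotation but no scaling or reflection, which has a closed-form solution. The general analysis handles any number of dimensions and more options; the function name is hypothetical.

```python
import math


def procrustes_2d(a, b):
    """Rotate centred configuration a onto centred configuration b (2-D only).

    a and b are lists of (x, y) points; point i of a corresponds to point i of b.
    Returns (theta, residual): the optimal rotation angle in radians and the
    remaining sum of squared distances between matched points.
    """
    n = len(a)
    ca = (sum(x for x, _ in a) / n, sum(y for _, y in a) / n)
    cb = (sum(x for x, _ in b) / n, sum(y for _, y in b) / n)
    a0 = [(x - ca[0], y - ca[1]) for x, y in a]   # remove translation
    b0 = [(x - cb[0], y - cb[1]) for x, y in b]
    s = sum(ax * by - ay * bx for (ax, ay), (bx, by) in zip(a0, b0))
    c = sum(ax * bx + ay * by for (ax, ay), (bx, by) in zip(a0, b0))
    theta = math.atan2(s, c)                      # least-squares rotation angle
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    rotated = [(x * cos_t - y * sin_t, x * sin_t + y * cos_t) for x, y in a0]
    residual = sum((rx - bx) ** 2 + (ry - by) ** 2
                   for (rx, ry), (bx, by) in zip(rotated, b0))
    return theta, residual


# b is a turned by 90 degrees and shifted, so the residual should be about zero.
a = [(0, 0), (2, 0), (2, 1), (0, 1)]
b = [(3, 4), (3, 6), (2, 6), (2, 4)]
print(procrustes_2d(a, b))
```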
Imputation of missing data. The macros perform hot-deck imputation of missing values, borrowing valid values from cases which are similar to cases with missing data by some background characteristics. A separate macro performs an arbitrary, user-defined borrowing of values from some cases by other cases.
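Hot-deck imputation in its simplest sequential form can be sketched in Python as follows. This is an assumed, simplified variant (the donor is the most recent preceding case with a valid value in the same class of background characteristics), not necessarily how the macros choose donors; the function and parameter names are hypothetical, and missing values are stored as `None`.

```python
def hot_deck(records, target, by):
    """Sequential hot-deck imputation.

    records: list of dicts (cases); target: name of the variable to impute;
    by: list of background variables that define similarity classes.
    Returns new records; a missing target value is borrowed from the latest
    earlier case of the same class, or left missing if there is no donor yet.
    """
    last_seen = {}
    result = []
    for rec in records:
        rec = dict(rec)                       # do not modify the input
        key = tuple(rec[b] for b in by)
        if rec[target] is None:
            if key in last_seen:
                rec[target] = last_seen[key]  # borrow from the donor
        else:
            last_seen[key] = rec[target]      # this case becomes the donor
        result.append(rec)
    return result


data = [
    {"region": "N", "income": 100},
    {"region": "S", "income": 200},
    {"region": "N", "income": None},
    {"region": "S", "income": None},
    {"region": "E", "income": None},
]
print([r["income"] for r in hot_deck(data, "income", ["region"])])
```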
Functions for MATRIX-END MATRIX. Collection of useful statistical, mathematical, restructuring and other functions for a matrix session in SPSS.
Clustering. Macros for hierarchical cluster analysis (with options for constraining by a pre-existing structure, stopping the agglomeration early, and others), for computing distances between already available groups/clusters, and for assigning new objects to them. A macro for initializing cluster centres for K-means clustering.