1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
**QUESTION .

I have the following data that I want to put into vectors:

id      time1   time2
1       100     200
1       200     300
1       300     400
1       500     600
1       600     700
1       800     830
2       800     900
2       900     930

I want two vectors to include only the continuous observations for that id
(ie the first 3 time1 variables in the vector, the first 3 time2 variables
in another vector).  I don't know the number of continuous observations,
which would be the subscript for the vector.  How can I get this number?
I'm going to use the 2 vectors to create one record with id, time1 of the
first record of the original data and time 2 of the last record of the
original data so I don't need to create new vectors for the next set of
continuous records.

Desired output:

id      time1   time2
1       100     400
1       500     700
1       800     830
2       800     930


*ANSWER by rlevesque@videotron.ca.

DATA LIST LIST /id time1 time2.
BEGIN DATA
1       100     200
1       200     300
1       300     400
1       500     600
1       600     700
1       800     830
2       800     900
2       900     930
END DATA.
LIST.

* this assumes the file is already sorted.
MATCH FILES FILE=* /BY id /FIRST=first /LAST=last.
CREATE time1a= LEAD(time1,1).
CREATE time2a= LAG(time2,1).

* identify the beginning of a period.
DO IF first.
COMPUTE beg1=1.
ELSE.
COMPUTE beg1=time1<>time2a.
END IF.

* identify the end of a period.
DO IF last.
COMPUTE end1=1.
ELSE.
COMPUTE end1=time2<>time1a.
END IF.

* Get the end values.
DO IF end1.
COMPUTE val2=time2.
END IF.

* Copy the end values to the line of the beginning values.
COMPUTE casenb=$CASENUM.
SORT CASES BY casenb(D).
IF MISSING(val2) val2=LAG(val2).
SORT CASES BY casenb.

* Keep only the lines we want.
EXECUTE.
SELECT IF beg1=1.
MATCH FILES FILE=*  /KEEP=id time1 val2.
RENAME VARIABLES (val2=time2).
FORMATS id time1 time2 (F8.0).
EXECUTE.