Generate random dates | Raynald's SPSS Tools

SPSS AnswerNet: Result 

Solution ID:	 	100000483	
Product:	 	SPSS Base 	
Version:	 		
O/S:	 		
Question Type:	 	Syntax/Batch/Scripting	
Question Subtype:	 	Date and Time	
Title:
Generating random dates from a date range 
Description:
Q. 
How can I generate 100 dates randomly, drawing from the 
range of Jan. 1, 1920 to Dec. 31, 1989, inclusive? 
A. 
Two syntax jobs are presented below. The first samples 
dates in the range with replacement, so a given date may appear 
more than once. The second job samples dates without replacement. 
Code is provided at the end of the note for a rough check of the 
distribution of dates generated by either job. 
Job 1: Dates Sampled with Replacement: 
Dates are stored in SPSS as the number of seconds since the 
beginning of the Gregorian calendar, i.e. midnight on 
Oct. 14, 1582. To create each random date, this program 
generates a random number of seconds in the range implied by 
the dates that you specify and stores the number in the 
variable RDAY. The XDATE.DATE function removes the fractional 
part of the day so that RDAY is the number of seconds to 
midnight of the same day. The display of RDAY is then formatted 
with an SPSS date format (ADATE10 in this example). 
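As a cross-check outside SPSS, the storage scheme described above can be reproduced in a few lines of Python. This is only an illustrative sketch: SPSS_EPOCH, spss_seconds, truncate_to_midnight, and to_datetime are hypothetical helper names, not SPSS functions, and Python's proleptic Gregorian calendar is assumed to line up with the SPSS day count.

```python
from datetime import datetime, timedelta

# Assumption: SPSS date values count seconds from midnight, Oct. 14, 1582.
SPSS_EPOCH = datetime(1582, 10, 14)
SECONDS_PER_DAY = 86400

def spss_seconds(moment):
    """Seconds from the assumed SPSS epoch to a datetime (hypothetical helper)."""
    return (moment - SPSS_EPOCH).total_seconds()

def truncate_to_midnight(seconds):
    """Drop the fractional part of the day, as XDATE.DATE is described as doing."""
    return (seconds // SECONDS_PER_DAY) * SECONDS_PER_DAY

def to_datetime(seconds):
    """Convert a seconds count back to a calendar date and time."""
    return SPSS_EPOCH + timedelta(seconds=seconds)

noon = spss_seconds(datetime(1920, 1, 1, 12, 0, 0))
midnight = truncate_to_midnight(noon)
print(to_datetime(midnight))  # 1920-01-01 00:00:00
```

Under these assumptions, Jan. 1, 1920 falls 123,165 days after the epoch, which is 10,641,456,000 seconds.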
Note that the end date chosen is one day later than the last 
date in the desired range. This is because the uniform number 
generator rv.uniform(a,b) generates a number between a and b, 
exclusive. The largest number that can be generated therefore 
falls in the last second of Dec. 31, 1989, i.e. the second 
beginning at 11:59:59 p.m. 
For the starting point, however, you do not need to specify the 
day preceding your desired start date. Although the 
exact stroke of midnight on Jan. 1, 1920 cannot be generated 
by rv.uniform(date.dmy(1,1,1920),date.dmy(1,1,1990)), any point 
in the following second and throughout that day can be generated. 
* generate 100 random dates from 1/1/1920 to 31/12/1989 . 
* with replacement . 
new file. 
input program. 
loop #i = 1 to 100. 
compute rday = 
  xdate.date(rv.uniform(date.dmy(1,1,1920),date.dmy(1,1,1990))). 
end case. 
end loop. 
end file. 
end input program. 
execute. 
formats rday (adate10). 
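The end-point reasoning in Job 1 can be illustrated outside SPSS with a rough Python analogue. This is a sketch under stated assumptions, not a translation of the SPSS job: sample_dates_with_replacement is a hypothetical name, and it draws whole seconds with randrange (whose upper bound is excluded) rather than a continuous uniform value.

```python
import random
from datetime import date, timedelta

def sample_dates_with_replacement(first, last, n, rng=random):
    """Draw n dates from first..last inclusive; repeats are allowed."""
    seconds_per_day = 86400
    # The upper bound is midnight of the day AFTER `last`; it is never drawn.
    span = ((last - first).days + 1) * seconds_per_day
    dates = []
    for _ in range(n):
        offset = rng.randrange(span)  # 0 <= offset < span
        # Integer division discards the time of day, leaving whole days.
        dates.append(first + timedelta(days=offset // seconds_per_day))
    return dates

sample = sample_dates_with_replacement(date(1920, 1, 1), date(1989, 12, 31), 100)
print(min(sample), max(sample))
```

Because the excluded bound sits one day past the last wanted date, every second of Dec. 31, 1989 remains eligible, while Jan. 1, 1990 can never appear.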
Job 2: Dates Sampled Without Replacement: 
To sample without replacement, a file is constructed where 
every date appears exactly once, in a variable named RDAY. 
RDAY is first created as the number of days from the 
beginning of the Gregorian calendar to the date implied by 
the case's sequence number, where the first case is 
1/1/1920 and each subsequent case represents an increase of 
one day. Using the YRMODA function saves you the work of 
calculating the number of days for the start and end of the loop. 
A random number is generated for each case as the case is added 
to the active file. After all cases are added, this random 
number is then ranked, with the rank stored in the variable RX. 
These ranks thus provide a random ordering for the days. 
Selecting cases with RX less than or equal to 100 selects 
the first 100 randomly-ordered days from the full set. 
Multiplying RDAY by 86,400 (the number of seconds in a day) 
places RDAY on a date scale and a standard SPSS date format 
can be applied. 
In contrast to the generation of RDAY in Job 1, the scale of the 
random number X in Job 2 is irrelevant. Also, the start and 
end points for the loop are both included in the file before 
sampling. You don't have to ask for Jan. 1, 1990 to include 
Dec. 31, 1989. 
* generate 100 random dates from 1/1/1920 to 31/12/1989 . 
* no replacement . 
new file. 
input program. 
loop rday = yrmoda(1920,1,1) to yrmoda(1989,12,31). 
compute x = rv.uniform(0,1). 
end case. 
end loop. 
end file. 
end input program. 
execute. 
rank variables = x /rank into rx. 
select if (rx <= 100). 
execute. 
compute rday = rday*86400 . 
formats rday (adate10). 
execute. 
* you can delete x and rx when you save the file. 
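The rank-and-select idea behind Job 2 can be sketched in Python as follows. The function name sample_dates_without_replacement is hypothetical, and sorting on a random key is used here as a stand-in for the SPSS RANK step. It is meant as an illustration of the technique rather than a port of the job.

```python
import random
from datetime import date, timedelta

def sample_dates_without_replacement(first, last, n, rng=random):
    """Pick n distinct dates from first..last inclusive by ranking random keys."""
    n_days = (last - first).days + 1  # both end points are included
    all_days = [first + timedelta(days=i) for i in range(n_days)]
    # One random key per day, like X in Job 2; only the ordering matters.
    keyed = [(rng.random(), day) for day in all_days]
    keyed.sort()  # sorting by key plays the role of ranking X
    return [day for _, day in keyed[:n]]  # keep the n smallest keys

picked = sample_dates_without_replacement(date(1920, 1, 1), date(1989, 12, 31), 100)
print(len(picked), len(set(picked)))  # 100 100
```

The full range holds 25,568 days, and each appears exactly once before selection, so the 100 dates kept are necessarily distinct. In Python, random.sample on the list of days would give the same kind of result more directly; the keyed sort is shown only because it mirrors the steps described above.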
* For both jobs, the following gives you a rough check 
* of the distribution of your generated dates. 
* to get a distribution by decade, collapse each year to the start of its decade. 
compute decade = trunc(xdate.year(rday)/10)*10. 
formats decade (f8). 
frequencies decade. 
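The same decade collapse can be tried on a handful of dates in Python. This is a minimal sketch: decade_counts is a hypothetical helper, and the four example dates are made up for illustration rather than output from either job.

```python
from collections import Counter
from datetime import date

def decade_counts(dates):
    """Count dates per decade, keyed by the decade's first year."""
    # Integer division by 10 then multiplying by 10 maps 1929 to 1920, etc.
    return Counter((d.year // 10) * 10 for d in dates)

example = [date(1920, 1, 1), date(1929, 12, 31), date(1955, 6, 15), date(1989, 12, 31)]
for decade, count in sorted(decade_counts(example).items()):
    print(decade, count)
# 1920 2
# 1950 1
# 1980 1
```

With 100 dates drawn uniformly over seven decades, roughly 14 per decade would be expected, with noticeable random variation around that figure.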
Created on: 08/25/1999