How do we generate random numbers between two given numbers?
The runif function generates values from the uniform distribution. It returns a number that lies between two given limits, excluding the limits themselves.
ranNum <- runif(1,14.0,20.0)
ranNum
The 'runif' function takes three arguments. The first (n) is how many random numbers we want to generate, the second (min) is the lower limit, and the third (max) is the upper limit. Let's generate 4 random numbers.
ranNum <- runif(4,14.0,20.0)
ranNum
Notice that ranNum is now a vector of length 4. (Strictly speaking, it was a vector before too: in R, a single number is simply a vector of length 1.)
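Since every call to runif returns different values, it can help to fix the random seed while experimenting. The snippet below is a minimal sketch of that idea; the seed value 42 and the variable names first_draw and second_draw are arbitrary choices made for illustration.

```r
# Fix the seed so the "random" draws can be reproduced (42 is an arbitrary choice)
set.seed(42)
first_draw <- runif(4, min = 14.0, max = 20.0)

# Resetting the same seed reproduces exactly the same values
set.seed(42)
second_draw <- runif(4, min = 14.0, max = 20.0)

identical(first_draw, second_draw)           # TRUE
all(first_draw > 14.0 & first_draw < 20.0)   # TRUE: every value lies inside the limits
length(first_draw)                           # 4
```

Without the set.seed calls, the two draws would almost certainly differ.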
What happens if we do not specify the upper limit and the lower limit and just specify how many random numbers we want?
ranNum <- runif(4)
ranNum
Yes, the default range is 0 to 1.
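One way to convince ourselves of this default is to compare a call without limits against a call that spells out min = 0 and max = 1, using the same seed for both. This is only an illustrative check; the seed value and variable names are arbitrary.

```r
# With the same seed, the call without limits...
set.seed(1)
default_range <- runif(4)

# ...matches the call that states the default limits explicitly
set.seed(1)
explicit_range <- runif(4, min = 0, max = 1)

identical(default_range, explicit_range)   # TRUE
```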
The numbers we generated above were floating-point values (R calls them doubles). What if we want to generate integers?
In that case we use the sample function. Its first argument is a vector of values to draw from, and its second argument specifies how many values to draw.
ranNum <- sample(c(1,2,3,4,5,6,7,8,9,10), 1)
ranNum
If all the numbers are consecutive, then providing them all in the vector is a bit of an inefficient way of doing things, isn't it?
We can specify the range instead. For example, 1:10 means all the integers from 1 to 10, including 1 and 10.
ranNum <- sample(1:10, 1)
ranNum
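There is one small difference between the two ways of writing the population that is easy to miss: c(1,2,...) creates doubles that happen to be whole numbers, while 1:10 creates true integers. The sketch below checks this with typeof; the variable names are made up for the example.

```r
# Draw from a vector typed out by hand and from a range
from_vector <- sample(c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10), 1)
from_range  <- sample(1:10, 1)

typeof(from_vector)      # "double": whole-valued, but stored as a double
typeof(from_range)       # "integer"

# Either way, the value drawn is one of 1, 2, ..., 10
from_vector %in% 1:10    # TRUE
from_range %in% 1:10     # TRUE
```

For most purposes the difference does not matter, but it is worth knowing if a later step insists on integer input.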
Let us now generate multiple random numbers using the sample function.
ranNum <- sample(1:10, 5)
ranNum
Now run the above command a few times and see if you can get a set in which a number appears twice.
You won't, because by default a number that has been drawn is not put back. So what happens if you ask for more random numbers than there are values in your range? Let's find out by trying to draw 11 random numbers between 1 and 10.
ranNum <- sample(1:10, 11)
ranNum
What does the error say? Can you think of a solution to this problem by looking at the error message?
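If you would rather inspect the error without halting a script, one option is to wrap the call in tryCatch and keep the message as a string. This is a minimal sketch; error_text is a name chosen for the example, and the message shown in the comment is the English-locale wording.

```r
# Attempt the oversized draw, but return the error message instead of stopping
error_text <- tryCatch(
  sample(1:10, 11),
  error = function(e) conditionMessage(e)
)

error_text
# "cannot take a sample larger than the population when 'replace = FALSE'"
```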
What if we could somehow allow the function to draw a number more than once? That should solve the problem, right? In other words, once a number has been picked, we want it to be put back into the population (the set from which we are picking random numbers) so that it is eligible to be picked again. This is called sampling with replacement, and it can be done by setting replace = TRUE.
ranNum <- sample(1:10, 5, replace = TRUE)
ranNum
Now try to draw 11 numbers and see what happens.
ranNum <- sample(1:10, 11, replace = TRUE)
ranNum
At least one number in the lot must now appear at least twice: with 11 draws and only 10 possible values, a repeat is unavoidable. (The pigeonhole principle, anyone? ...no? That's fine, we won't need the details here!)
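The repeat can also be checked in code rather than by eye. The sketch below uses anyDuplicated, which returns 0 when a vector has no repeated values and the position of the first repeat otherwise; the seed and the name draws are arbitrary.

```r
set.seed(123)   # arbitrary seed, only so the run can be repeated
draws <- sample(1:10, 11, replace = TRUE)

# 11 draws from only 10 possible values must contain a repeat
anyDuplicated(draws) > 0                # TRUE

# Count how many times each value was drawn
table(draws)

# Without replacement, a repeat can never occur
anyDuplicated(sample(1:10, 5)) == 0     # TRUE
```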
So far we have been dealing with numbers. Can we draw other kinds of values, such as names?
Yes, we can. In fact, it's the same procedure.
If I want to pick 4 guys at random from the names of 10 guys that I have, I can use the sample function to do so.
sample(c("Adam","Bob","Charles","Doug","Eric","Frank","Glen","Harry","Ivan","James"), 4)
Now, of course, the same rule applies here too: you cannot draw more names than there are in the population unless you set replace = TRUE.
But what if the number of names you sample is exactly equal to the size of the population? For example, what if I draw 10 names at random from a set of 10 names?
sample(c("Adam","Bob","Charles","Doug","Eric","Frank","Glen","Harry","Ivan","James"), 10)
The result doesn't seem very useful, does it? Can you think of any use for it?
What if you are asked to randomize the order of the names in a vector? Don't you think this could be of use in such a case? Not so useless after all!
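When the size argument is left out, sample draws as many values as the vector contains, so shuffling reduces to a single call. Below is a minimal sketch using a shortened version of the name list; the variable names and the seed are chosen only for illustration.

```r
guys <- c("Adam", "Bob", "Charles", "Doug", "Eric")

set.seed(7)                 # arbitrary seed, only for repeatability
shuffled <- sample(guys)    # size defaults to length(guys)

shuffled                                # the same five names in a random order
length(shuffled) == length(guys)        # TRUE
identical(sort(shuffled), sort(guys))   # TRUE: nothing added, nothing lost
```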
What if we want to generate numbers from a normal distribution instead of a uniform one?
In that case we use the rnorm function. By default, it uses a mean of 0 and a standard deviation of 1.
ranNum <- rnorm(4)
ranNum
If you want to provide your own mean and standard deviation, then that can be done too.
ranNum <- rnorm(4,mean=25, sd=10)
ranNum
Let's draw 50 such numbers from a normal distribution and plot a histogram to see what it looks like.
ranNum <- rnorm(50,mean=25, sd=10)
hist(ranNum)
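With only 50 draws the histogram can look fairly ragged. As a rough illustration, drawing many more values makes the bell shape clearer, and the sample mean and standard deviation land close to the values we asked for. The sample size of 10000, the seed, the breaks setting and the plot labels below are arbitrary choices for this sketch.

```r
set.seed(2024)   # arbitrary seed, only for repeatability

# Draw many more values from the same normal distribution
large_sample <- rnorm(10000, mean = 25, sd = 10)

# The sample statistics should be close to the requested 25 and 10
mean(large_sample)
sd(large_sample)

# More bins show the bell shape in finer detail
hist(large_sample, breaks = 40,
     main = "10000 draws from a normal distribution",
     xlab = "Value")
```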