A common use of tapply is to calculate the average of a numerical variable for each level of a factor (a categorical variable).
For example, to calculate the average score for different types of students, we can divide the students into the following categories:
1) Students who always ask questions
2) Students who frequently ask questions
3) Students who rarely ask questions
4) Students who never ask questions
moody <- read.csv("https://raw.githubusercontent.com/kunal0895/RDatasets/master/Moody2018.csv")
ask_question_grade <- tapply(moody$SCORE, moody$ASKS_QUESTIONS, mean)
# tapply(numerical variable, categorical variable, function applied to the numerical values within each category; here, mean)
ask_question_grade
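To see what tapply returns without downloading the dataset, here is a minimal sketch on a small made-up data frame. The data frame `toy` and all of its values are hypothetical; only the column names mirror the Moody dataset.

```r
# Hypothetical data: column names mirror moody, values are invented
toy <- data.frame(
  SCORE = c(90, 80, 70, 60, 50, 40),
  ASKS_QUESTIONS = c("always", "always", "never", "never", "rarely", "rarely")
)

# Mean SCORE within each ASKS_QUESTIONS category
toy_mean <- tapply(toy$SCORE, toy$ASKS_QUESTIONS, mean)
toy_mean
```

The result has one entry per category, named after that category, so a single average can be looked up by name, for example toy_mean["always"].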
Another use of tapply is to find the range, that is, the minimum and maximum, of a numerical variable within each category.
For example, suppose we want the range of SCORE for each type of student, using the same categories as before: students who always, frequently, rarely, or never ask questions.
moody <- read.csv("https://raw.githubusercontent.com/kunal0895/RDatasets/master/Moody2018.csv")
ask_question_range <- tapply(moody$SCORE, moody$ASKS_QUESTIONS, range)
# tapply(numerical variable, categorical variable, function applied to the numerical values within each category; here, range)
ask_question_range
The result gives the minimum and maximum score for each category of student. Because range returns two values, tapply returns a list with one (min, max) pair per category.
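As a rough illustration of how to work with that kind of result, the sketch below uses a small made-up data frame (the name `toy` and its values are hypothetical, not taken from the Moody dataset). It pulls out the pair for one category, and then shows one possible alternative: calling tapply separately with min and max and binding the two results into a table.

```r
# Hypothetical data: column names mirror moody, values are invented
toy <- data.frame(
  SCORE = c(90, 80, 70, 60, 50, 40),
  ASKS_QUESTIONS = c("always", "always", "never", "never", "rarely", "rarely")
)

# range() returns c(min, max), so each category holds a two-element vector
toy_range <- tapply(toy$SCORE, toy$ASKS_QUESTIONS, range)
toy_range[["always"]]

# Alternative: compute min and max separately, then combine them as columns
toy_min <- tapply(toy$SCORE, toy$ASKS_QUESTIONS, min)
toy_max <- tapply(toy$SCORE, toy$ASKS_QUESTIONS, max)
toy_table <- cbind(min = toy_min, max = toy_max)
toy_table
```

The second form is often easier to read or plot, since each row is a category and each column is a single number.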