mosaicplot

mosaicplot(x, ...), where x is a contingency table (for example, the result of table()) or a formula.
A mosaic plot is a graphical method for visualizing data from two or more qualitative (categorical) variables.

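Use a mosaic plot when you want to compare two categorical variables. The area of each block is proportional to the number of observations in that combination of categories.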
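To see what the plot encodes, it can help to look at the contingency table behind it. The sketch below uses a small made-up data frame (`toy`, with hypothetical columns `grade` and `phone`), not the Moody data, and prints the counts and shares that the blocks represent.

```r
# Hypothetical toy data, for illustration only (not the Moody dataset).
toy <- data.frame(
  grade = c("A", "A", "B", "B", "B", "F"),
  phone = c("never", "never", "always", "never", "always", "always")
)

# Cross-tabulate the two categorical variables.
counts <- table(toy$grade, toy$phone)
print(counts)

# Share of all observations in each cell; block areas follow these shares.
shares <- prop.table(counts)
print(round(shares, 2))

# mosaicplot() accepts the table directly.
mosaicplot(counts, main = "Toy example")
```

Passing a table, as here, and passing a formula, as in the examples on this page, are two routes to the same kind of plot.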

Example

Below is an example of a basic mosaic plot.

 moody <- read.csv("https://raw.githubusercontent.com/kunal0895/RDatasets/master/Moody2018.csv")
 mosaicplot(moody$GRADE~moody$ON_SMARTPHONE)

Let's change the orientation of the axis labels so that they take up less room and are easier to read.

 moody <- read.csv("https://raw.githubusercontent.com/kunal0895/RDatasets/master/Moody2018.csv")
 mosaicplot(moody$GRADE~moody$ON_SMARTPHONE,las=1)
 #"las" sets the orientation of the axis labels; las=1 keeps them horizontal.

You can also interchange the X-axis and Y-axis.

 moody <- read.csv("https://raw.githubusercontent.com/kunal0895/RDatasets/master/Moody2018.csv")
 mosaicplot(t(table(moody$GRADE,moody$ON_SMARTPHONE)))

 #t() transposes the table, which swaps the X and Y axes. table() is called first because t() needs a matrix-like object.

 #Interchanging X and Y-axis does not seem like a good idea in this case.
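As a rough check of what t() does to a table, the following sketch compares a transposed table with one built by swapping the arguments of table(). The vectors `grade` and `phone` are made up for illustration and are not taken from the Moody data.

```r
# Hypothetical toy vectors, for illustration only.
grade <- c("A", "A", "B", "F", "F", "F")
phone <- c("never", "always", "never", "always", "always", "never")

tab <- table(grade, phone)   # grade on rows, phone on columns
flipped <- t(tab)            # phone on rows, grade on columns

print(dim(tab))      # 3 rows, 2 columns
print(dim(flipped))  # 2 rows, 3 columns

# Swapping the argument order of table() gives the same counts as t().
swapped <- table(phone, grade)
print(all(flipped == swapped))

mosaicplot(flipped, main = "Axes interchanged")
```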

Let's add some colour to distinguish those blocks.

 moody <- read.csv("https://raw.githubusercontent.com/kunal0895/RDatasets/master/Moody2018.csv")
 mosaicplot(moody$GRADE~moody$ON_SMARTPHONE,color=c("blue","red"))

Sometimes there are too many blocks to name a colour for each one, so we can pass a range of colour numbers instead.

 moody <- read.csv("https://raw.githubusercontent.com/kunal0895/RDatasets/master/Moody2018.csv")
 mosaicplot(moody$ASKS_QUESTIONS~moody$GRADE,color=1:5)
 #color=1:5 fills the blocks with colours No.1 to No.5 of the current palette (see palette()).
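Colour numbers are looked up in R's current palette, so it may be useful to print the palette before relying on them. The sketch below does that and then colours a small made-up table (`toy`, hypothetical counts) with colours 1 to 5. The note that colours are recycled over the levels of the last variable reflects how mosaicplot() is understood to behave and is worth confirming against the help page.

```r
# Integer colours index into the current palette.
pal <- palette()
print(pal[1:5])

# Hypothetical toy table with five grade levels, for illustration only.
toy <- as.table(matrix(
  c(4, 6, 5, 3, 2,
    1, 2, 4, 6, 7),
  nrow = 2, byrow = TRUE,
  dimnames = list(asks = c("never", "always"),
                  grade = c("A", "B", "C", "D", "F"))
))

# Colours are recycled over the levels of the last variable (grade here).
mosaicplot(toy, color = 1:5, main = "Colours 1 to 5 from palette()")
```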

We can also add a title and a subtitle to it.

 moody <- read.csv("https://raw.githubusercontent.com/kunal0895/RDatasets/master/Moody2018.csv")
 mosaicplot(moody$GRADE~moody$ON_SMARTPHONE,color=c("blue","red"),main="Grade vs Smartphone",sub="5 categories")

 #"main" means the title, "sub" means the subtitle

Let's change the labels of X-axis and Y-axis.

 moody <- read.csv("https://raw.githubusercontent.com/kunal0895/RDatasets/master/Moody2018.csv")
 mosaicplot(moody$GRADE~moody$ON_SMARTPHONE,xlab="Grade",ylab="Frequency")

What if we only want to observe the behavior of students who got an "A" or an "F"?

 moody <- read.csv("https://raw.githubusercontent.com/kunal0895/RDatasets/master/Moody2018.csv")
 grade1 <- subset(moody,moody$GRADE=="A" | moody$GRADE=="F")
 #keep only the rows where GRADE is "A" or "F"
 mosaicplot(grade1$GRADE~grade1$ON_SMARTPHONE)

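 #Observe the plot carefully. If GRADE was read in as a factor (the default before R 4.0.0, or with stringsAsFactors=TRUE), the levels "B", "C" and "D" still appear even though their counts are 0, which makes the plot confusing.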

To remove the empty levels, rebuild the factor after subsetting.

 moody <- read.csv("https://raw.githubusercontent.com/kunal0895/RDatasets/master/Moody2018.csv")
 grade1 <- subset(moody,moody$GRADE=="A"|moody$GRADE=="F")
 grade1$GRADE <- factor(grade1$GRADE)
 #factor() rebuilds GRADE with only the levels that still occur ("A" and "F")
 mosaicplot(grade1$GRADE~grade1$ON_SMARTPHONE)
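The effect of rebuilding the factor can be seen without plotting. This is a minimal sketch on a made-up factor (`grades`, hypothetical values); it also shows droplevels() as an alternative to calling factor() again.

```r
# Hypothetical grades stored as a factor, for illustration only.
grades <- factor(c("A", "B", "C", "D", "F", "A", "F"))

# Subsetting a factor keeps every original level.
kept <- grades[grades %in% c("A", "F")]
print(levels(kept))   # all five levels are still listed
print(table(kept))    # B, C and D appear with a count of 0

# Either call leaves only the levels that actually occur.
print(levels(factor(kept)))
print(levels(droplevels(kept)))
```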

Let's do a more complex task: in addition to selecting grades A and F, keep only the students whose ASKS_QUESTIONS value is "never" or "always".

 moody <- read.csv("https://raw.githubusercontent.com/kunal0895/RDatasets/master/Moody2018.csv")
 grade1 <- subset(moody,moody$GRADE=="A"|moody$GRADE=="F")
 grade2 <- subset(grade1,grade1$ASKS_QUESTIONS=="never"|grade1$ASKS_QUESTIONS=="always")
 question <- factor(grade2$ASKS_QUESTIONS)
 grade <- factor(grade2$GRADE)
 mosaicplot(table(grade,question))
 #table() builds a contingency table (a matrix of counts) from the grade and question vectors.
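The two filtering steps can also be combined into one. The sketch below is one possible way to do it, using %in% and droplevels() on a made-up data frame (`toy`, hypothetical rows that reuse the column names GRADE and ASKS_QUESTIONS); it is not the only approach.

```r
# Hypothetical toy data with factor columns, for illustration only.
toy <- data.frame(
  GRADE = factor(c("A", "A", "B", "C", "F", "F", "F")),
  ASKS_QUESTIONS = factor(c("never", "always", "sometimes", "never",
                            "always", "sometimes", "never"))
)

# Both conditions in one subset() call; %in% replaces chained == and | tests.
picked <- subset(toy,
                 GRADE %in% c("A", "F") &
                   ASKS_QUESTIONS %in% c("never", "always"))

# droplevels() on a data frame drops unused levels in every factor column.
picked <- droplevels(picked)

tab <- table(grade = picked$GRADE, question = picked$ASKS_QUESTIONS)
print(tab)
mosaicplot(tab, main = "A/F by never/always")
```

Calling droplevels() once on the data frame avoids rebuilding each factor separately.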