Note: There are often multiple ways to answer each question.

  1. Create a vector of integers from -5 to 10 (inclusive) and assign it to the variable x.
x <- -5:10
x
##  [1] -5 -4 -3 -2 -1  0  1  2  3  4  5  6  7  8  9 10
  1. Create a vector which consists of the first 10 multiples of 3 (i.e. 3, 6, …, 30) and assign it to the variable y.
y <- 1:10 * 3
y
##  [1]  3  6  9 12 15 18 21 24 27 30
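
As the note at the top says, there is usually more than one way to get the same answer. A minimal sketch of two alternatives for this question, using only base R (the names y1, y2 and y3 are just illustrative):

```r
y1 <- 1:10 * 3              # the solution above
y2 <- seq(3, 30, by = 3)    # count from 3 to 30 in steps of 3
y3 <- 3 * seq_len(10)       # seq_len(10) is the same as 1:10

# All three hold the same values
all.equal(y1, y2)
all.equal(y1, y3)
```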
  1. What is the result of x + y? Why does R return this result?
x + y
## Warning in x + y: longer object length is not a multiple of shorter object
## length
##  [1] -2  2  6 10 14 18 22 26 30 34  8 12 16 20 24 28

R performs addition element-wise. This is straightforward when the two vectors have the same length. However, x has length 16 while y has length 10, so R “recycles” the shorter vector, repeating its elements until it matches the length of the longer one. Hence, the output is really -5:10 + c(3, 6, ..., 30, 3, 6, ..., 18). The warning appears because 16 is not a multiple of 10, so the last repetition of y is cut short.
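
To make the recycling visible, the sketch below builds the recycled version of y by hand with rep() and adds it to x. The names y_recycled and manual are only for illustration.

```r
x <- -5:10       # length 16
y <- 1:10 * 3    # length 10

# Repeat y until it has the same length as x
y_recycled <- rep(y, length.out = length(x))
y_recycled

# Same values as x + y, but with no warning,
# because both vectors now have length 16
manual <- x + y_recycled
manual
```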

  1. What is the result of z <- c(1, 2, "3")? Why does R return this result?
z <- c(1, 2, "3")
z
## [1] "1" "2" "3"

In R, all elements of an atomic vector must be of the same type. In this case, we are giving the vector both numeric and character values. Any number can be converted to a character string, but not every character string can be converted to a number, so R silently coerces 1 and 2 to “1” and “2”.
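
A short sketch of this coercion in both directions. The second vector, c("1", "two"), is a made-up example to show what happens when a string does not look like a number.

```r
z <- c(1, 2, "3")
class(z)         # "character": the numbers were coerced

# Converting back works here because every string looks like a number
as.numeric(z)

# A string that is not a number becomes NA (R also issues a warning,
# which is suppressed here to keep the output short)
suppressWarnings(as.numeric(c("1", "two")))
```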

  1. Create the following list:
person <- list(name = "John Doe",
               age = 26,
               classes = c("ENG", "MAT", "SCI", "SPA", "MUS"))

What is the result of person$classes[2]? Why does R return this result?

person$classes[2]
## [1] "MAT"

The code is the same as (person$classes)[2]. person$classes gives us the vector c("ENG", "MAT", "SCI", "SPA", "MUS"), and the [2] extracts the second element of this vector.
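
The same extraction can be written in a few equivalent ways. The sketch below also shows how single brackets differ: they return a sub-list rather than the vector itself. The variable classes is introduced only for illustration.

```r
person <- list(name = "John Doe",
               age = 26,
               classes = c("ENG", "MAT", "SCI", "SPA", "MUS"))

# Two-step version of person$classes[2]
classes <- person$classes
classes[2]

# [[ ]] extracts the element itself, just like $
person[["classes"]][2]

# [ ] keeps the list wrapper: the result is a list of length 1
class(person["classes"])
```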

  1. What code can I use to find out how many classes John Doe took?
length(person$classes)
## [1] 5
  1. Load the vehicles dataset that we used in class. How many rows and columns are there in this dataset? What are the column names?
library(fueleconomy)
data(vehicles)
dim(vehicles)
## [1] 33442    12
names(vehicles)  # colnames(vehicles) also works
##  [1] "id"    "make"  "model" "year"  "class" "trans" "drive" "cyl"   "displ"
## [10] "fuel"  "hwy"   "cty"
  1. What is the highest hwy value?
max(vehicles$hwy)
## [1] 109
  1. How many of each value are there in the cyl column?
table(vehicles$cyl)
## 
##     2     3     4     5     6     8    10    12    16 
##    45   182 12381   718 11885  7550   138   478     7

Note that table() omits NAs by default, so the counts above do not tell you whether any values are missing. To find the number of NAs, you can use summary(vehicles$cyl) or sum(is.na(vehicles$cyl)).
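
A minimal sketch of this behaviour on a small made-up vector (cyl_toy is hypothetical, not the real cyl column). It also uses the useNA argument of table(), which makes the NA count part of the table.

```r
# Hypothetical toy data with two missing values
cyl_toy <- c(4, 6, 4, NA, 8, NA)

table(cyl_toy)                    # NAs are dropped from the counts
table(cyl_toy, useNA = "ifany")   # adds an <NA> entry when NAs exist
sum(is.na(cyl_toy))               # number of NAs
```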

  1. What are the mean, median and standard deviation of the cty column?
mean(vehicles$cty)
## [1] 17.491
median(vehicles$cty)
## [1] 17
sd(vehicles$cty)
## [1] 5.582174
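
These functions return NA if the input contains any missing values. The calls above work as-is, so cty evidently has none, but for a column that does, each function accepts na.rm = TRUE. A sketch on made-up values (cty_toy is hypothetical):

```r
# Hypothetical toy data with one missing value
cty_toy <- c(15, 17, NA, 22)

mean(cty_toy)                   # NA, because of the missing value
mean(cty_toy, na.rm = TRUE)     # mean of 15, 17 and 22
median(cty_toy, na.rm = TRUE)
sd(cty_toy, na.rm = TRUE)
```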