Chapter 10 Matrices in R

In R matrices are two-dimensional collections of elements all of which have the same mode or type. This is different than a data frame in which the columns of the frame can hold elements of different type (but all of the same length), or from a list which can hold objects of arbitrary type and length. Matrices are more efficient for carrying out most numerical operations, so if you’re working with a very large data set that is amenable to representation by a matrix you should consider using this data structure.

library(tidyverse)

10.1 Creating matrices in R

There are a number of different ways to create matrices in R. For creating small matrices at the command line you can use the matrix() function.

> x <- matrix(1:5) # creates a column vector
> x
##      [,1]
## [1,]    1
## [2,]    2
## [3,]    3
## [4,]    4
## [5,]    5
> X <- matrix(1:12, nrow=4) # creates a matrix
> X
##      [,1] [,2] [,3]
## [1,]    1    5    9
## [2,]    2    6   10
## [3,]    3    7   11
## [4,]    4    8   12
> dim(X) # give the shape of the matrix
## [1] 4 3

matrix() takes a data vector as input and the shape of the matrix to be created is specified by using the nrow and ncol arguments. If the number of elements in the input data vector is less than nrows \(\times\) ncols the elements will be ‘recycled’ as discussed in previous chapters. Without any shape arguments the matrix() function will create a column vector as shown above. By default the matrix() function fills in the matrix in a column-wise fashion. To fill in the matrix in a row-wise fashion use the argument byrow=T.

If you have a pre-existing data set in a list or data frame you can use the as.matrix() function to convert it to a matrix.

> iris.mtx <- as.matrix(iris)
> head(iris.mtx) # NOTE: the elements were all converted to character
##      Sepal.Length Sepal.Width Petal.Length Petal.Width Species 
## [1,] "5.1"        "3.5"       "1.4"        "0.2"       "setosa"
## [2,] "4.9"        "3.0"       "1.4"        "0.2"       "setosa"
## [3,] "4.7"        "3.2"       "1.3"        "0.2"       "setosa"
## [4,] "4.6"        "3.1"       "1.5"        "0.2"       "setosa"
## [5,] "5.0"        "3.6"       "1.4"        "0.2"       "setosa"
## [6,] "5.4"        "3.9"       "1.7"        "0.4"       "setosa"

Since all elements of an R matrix must be of the same type, when we passed the iris data frame to as.matrix(), everything was converted to a character due to the presence of the Species column in the data frame.

> # This is probably more along the lines of what you want
> iris.mtx <- iris %>% dplyr::select(-Species) %>% as.matrix
> head(iris.mtx)
##      Sepal.Length Sepal.Width Petal.Length Petal.Width
## [1,]          5.1         3.5          1.4         0.2
## [2,]          4.9         3.0          1.4         0.2
## [3,]          4.7         3.2          1.3         0.2
## [4,]          4.6         3.1          1.5         0.2
## [5,]          5.0         3.6          1.4         0.2
## [6,]          5.4         3.9          1.7         0.4

You can use the various indexing operations to get particular rows, columns, or elements. Here are some examples:

> X <- matrix(1:12, nrow=4)
> X
##      [,1] [,2] [,3]
## [1,]    1    5    9
## [2,]    2    6   10
## [3,]    3    7   11
## [4,]    4    8   12
> X[1,] # get the first row
## [1] 1 5 9
> X[,1] # get the first column
## [1] 1 2 3 4
> X[1:2,] # get the first two rows
##      [,1] [,2] [,3]
## [1,]    1    5    9
## [2,]    2    6   10
> X[,2:3] # get the second and third columns
##      [,1] [,2]
## [1,]    5    9
## [2,]    6   10
## [3,]    7   11
## [4,]    8   12
> Y <- matrix(1:12, byrow=T, nrow=4)
> Y
##      [,1] [,2] [,3]
## [1,]    1    2    3
## [2,]    4    5    6
## [3,]    7    8    9
## [4,]   10   11   12
> Y[4] # see explanation below
## [1] 10
> Y[5]
## [1] 2
> dim(Y) <- c(2,6)  # reshape Y
> Y
##      [,1] [,2] [,3] [,4] [,5] [,6]
## [1,]    1    7    2    8    3    9
## [2,]    4   10    5   11    6   12
> Y[5]
## [1] 2

The example above where we create a matrix Y is meant to show that matrices are stored internally in a column wise fashion (think of the columns stacked one atop the other), regardless of whether we use the byrow=T argument. Therefore using single indices returns the elements with respect to this arrangement. Note also the use of assignment operator in conjuction with the dim() function to reshape the matrix. Despite the reshaping, the internal representation in memory hasn’t changed so Y[5] still gives the same element.

You can use the diag() function to get the diagonal of a matrix or to create a diagonal matrix as show below:

> Z <- matrix(rnorm(16), ncol=4)
> Z
##            [,1]         [,2]       [,3]       [,4]
## [1,]  0.2523397  0.729350262  0.4769644 -2.0207988
## [2,]  0.1238106  0.002260243  0.6107511 -0.2396683
## [3,] -0.6292392 -0.356499431 -2.0468266  0.4721059
## [4,]  3.4700328  2.466565189 -1.3042726  0.7572256
> diag(Z)
## [1]  0.252339744  0.002260243 -2.046826558  0.757225599
> diag(5) # create the 5 x 5 identity matrix
##      [,1] [,2] [,3] [,4] [,5]
## [1,]    1    0    0    0    0
## [2,]    0    1    0    0    0
## [3,]    0    0    1    0    0
## [4,]    0    0    0    1    0
## [5,]    0    0    0    0    1
> s <- sqrt(10:13)
> diag(s)
##          [,1]     [,2]     [,3]     [,4]
## [1,] 3.162278 0.000000 0.000000 0.000000
## [2,] 0.000000 3.316625 0.000000 0.000000
## [3,] 0.000000 0.000000 3.464102 0.000000
## [4,] 0.000000 0.000000 0.000000 3.605551

10.2 Matrix arithmetic operations in R

The standard mathematical operations of addition and subtraction and scalar multiplication work element-wise for matrices in the same way as they did for vectors. Matrix multiplication uses the operator %*% which you saw last week for the dot product. To get the transpose of a matrix use the function t().

> A <- matrix(1:12, nrow=4)
> A
##      [,1] [,2] [,3]
## [1,]    1    5    9
## [2,]    2    6   10
## [3,]    3    7   11
## [4,]    4    8   12
> 
> B <- matrix(rnorm(12), nrow=4)
> B
##            [,1]        [,2]       [,3]
## [1,]  0.9290181  1.06809094 -0.3345056
## [2,] -1.5126220 -0.41861130 -0.2329943
## [3,] -1.1397590  0.03192664 -0.9134459
## [4,] -1.1113718 -0.69490935 -1.5228054
> 
> t(A)  # transpose
##      [,1] [,2] [,3] [,4]
## [1,]    1    2    3    4
## [2,]    5    6    7    8
## [3,]    9   10   11   12
> A + B  # matrix addition
##          [,1]     [,2]      [,3]
## [1,] 1.929018 6.068091  8.665494
## [2,] 0.487378 5.581389  9.767006
## [3,] 1.860241 7.031927 10.086554
## [4,] 2.888628 7.305091 10.477195
> A - B  # matrix subtraction
##            [,1]     [,2]      [,3]
## [1,] 0.07098188 3.931909  9.334506
## [2,] 3.51262199 6.418611 10.232994
## [3,] 4.13975904 6.968073 11.913446
## [4,] 5.11137177 8.694909 13.522805
> 5 * A  # multiplication by a scalar
##      [,1] [,2] [,3]
## [1,]    5   25   45
## [2,]   10   30   50
## [3,]   15   35   55
## [4,]   20   40   60

When applying matrix multiplication, the dimensions of the matrices involved must be conformable. For example, you can’t do this:

A %*% B  # do you understand why this generates an error?

But this works:

> A %*% t(B)
##          [,1]       [,2]       [,3]      [,4]
## [1,] 3.258923  -5.702627  -9.201139 -18.29117
## [2,] 4.921526  -7.866855 -11.222417 -21.62025
## [3,] 6.584130 -10.031083 -13.243695 -24.94934
## [4,] 8.246733 -12.195310 -15.264974 -28.27843

10.3 Descriptive statistics as matrix functions

Assume you have a data set represented as a \(n \times p\) matrix, \(X\), with observations in rows and variables in columns. Below I give formulae for calculating some descriptive statistics as matrix functions.

10.3.1 Mean vector and matrix

You can calculate a row vector of means, \(\mathbf{m}\), as: \[ \mathbf{m} = \frac{1}{n} \mathbf{1}^T X \] where \(1\) is a \(n \times 1\) vector of ones.

A \(n \times p\) matrix \(M\) where each column is filled with the mean value for that column is: \[ M = \mathbf{1}\mathbf{m} \]

10.3.2 Deviation matrix

To re-express each value as the deviation from the variable means (i.e.~each columns is a mean centered vector) we calculate a deviation matrix: \[ D = X - M \]

10.3.3 Covariance matrix

The \(p \times p\) covariance matrix can be expressed as a matrix product of the deviation matrix: \[ S = \frac{1}{n-1} D^T D \]

10.3.4 Correlation matrix

The correlation matrix, \(R\), can be calculated from the covariance matrix by: \[ R = V S V \]

where \(V\) is a \(p \times p\) diagonal matrix where \(V_{ii} = 1/\sqrt{S_{ii}}\).

10.4 Matrix Inverse

The function solve() can be used to find matrix inverses in R.

A <- matrix(1:4, nrow=2)
A
##      [,1] [,2]
## [1,]    1    3
## [2,]    2    4

Ainv <- solve(A)
Ainv
##      [,1] [,2]
## [1,]   -2  1.5
## [2,]    1 -0.5

A %*% Ainv   # should give identity matrix
##      [,1] [,2]
## [1,]    1    0
## [2,]    0    1
Ainv %*% A   # should also result in identity matrix
##      [,1] [,2]
## [1,]    1    0
## [2,]    0    1

Keep in mind that not all square matrices are invertible:

C <- matrix(1:16, nrow=4)
C
##      [,1] [,2] [,3] [,4]
## [1,]    1    5    9   13
## [2,]    2    6   10   14
## [3,]    3    7   11   15
## [4,]    4    8   12   16

Cinv <- solve(C)
## Error in solve.default(C): Lapack routine dgesv: system is exactly singular: U[4,4] = 0

10.5 Solving sets of simultaneous equations

The solve() function introduced above can also be used to solve sets of simultaneous equations.

For example, given the set of equations below:

\[ \begin{eqnarray*} x_1 + 3x_2 + 2x_3 & = & 3\\ -x_1 + x_2 + 2x_3 & = & -2\\ 2x_1 + 4x_2 -x_3 & = & 10 \end{eqnarray*} \]

We can rewrite this in vector form as:

\[x_1 \begin{bmatrix} 1 \\ -1 \\ 2 \end{bmatrix} + x_2 \begin{bmatrix} 3 \\ 1 \\ 4 \end{bmatrix} + x_3 \begin{bmatrix} 2 \\ 2 \\ 1 \end{bmatrix} = \begin{bmatrix} 3 \\ -2 \\ 10 \end{bmatrix} \] which is equivalent to the matrix form:

\[ \begin{bmatrix} 1 & 3 & 2 \\ -1 & 1 & 2\\ 2 & 4 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 3 \\ -2 \\ 10 \end{bmatrix} \] solve() takes two arguments: - a – a square numeric matrix containing the coefficients of the linear equations (the left-most matrix above) - b – a vector (or matrix) giving the right hand side of the linear equations

First let’s create the matrix of coefficients on the left and right sides:

A = matrix( c(c(1, 3 , 2),
              c(-1, 1, 2),
              c(2, 4, 1)), 
            nrow=3, ncol=3, byrow=TRUE)

b = c(3, -2, 10)

Now we solve the equations:

x <- solve(A,b)
x
## [1] -2.25  4.75 -4.50

Let’s confirm the solution works by multiplying A by x:

A %*% x
##      [,1]
## [1,]    3
## [2,]   -2
## [3,]   10

Voila!