class: center, middle, inverse, title-slide

# SummR Camp
## Mathematics Review: Matrices and Functions
### Kat Sadikova (many thanks to Louisa H. Smith)
### August 26, 2020

---
class:middle,center

<style type="text/css">
/* custom.css */
.title-slide {
  background-image: url("img/camp.svg");
  background-size: cover;
}
.left-code {
  #color: #777;
  width: 43%;
  height: 92%;
  float: left;
  #font-size: 0.8em;
  position: absolute;
}
.right-plot {
  width: 50%;
  float: right;
  padding-left: 5%;
}
.left-col {
  width: 60%;
  float: left;
  position: absolute;
}
.right-col {
  width: 30%;
  float: right;
  padding-left: 5%;
}
.plot-callout {
  height: 225px;
  width: 450px;
  bottom: 5%;
  right: 5%;
  position: absolute;
  padding: 0px;
  z-index: 100;
}
.plot-callout img {
  width: 100%;
  border: 4px solid #23373B;
}
h4 {
  color: #F97B64;
  font-size: 22px;
}
h5 {
  color: #317B33;
  font-size: 22px;
}
h1, h2, h3, h4, h5 {
  margin-top: 0;
}
.inverse h1, .inverse h2, .inverse h3 {
  color: #1F4257;
}
.remark-slide thead, .remark-slide tr:nth-child(2n) {
  background-color: white;
}
.title-slide, .title-slide h1, .title-slide h2, .title-slide h3 {
  color: #FFFFFF;
}
</style>

<!-- # Check-in -->
<!-- Please answer a few questions in this [check-in](https://forms.gle/TLtGxnFXAfd1jHYZ6) before we begin. -->

.left[
# Part 1: Matrices and linear algebra basics
]

#### Intuition and basic tools relevant for quantitative methods in population health sciences

---
class:middle,center

.left[
### Scalars, vectors, matrices
]

`\(a = a \text{ , a scalar}\)`

<br>
--

`\(\mathbf{y} =\begin{bmatrix}y_1 \\ y_2 \\ \vdots \\ y_n\end{bmatrix}\text{, a vector of length } n\)`

<br>
--

`\(\mathbf{X} = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1p} \\ x_{21} & x_{22} & \cdots & x_{2p} \\ \vdots &\vdots & \ddots & \vdots \\x_{n1} & x_{n2} & \cdots & x_{np}\end{bmatrix} \text{, an }n \times p \text{ matrix}\)`

---

.left[
### Note on dimensions
]

.right-col[


Imagine this very happy Ron Swanson **Rowing a Canoe** and remember that **Rows come before Columns** when indexing elements of a matrix
]

.left-col[
We read dimensions like `\(n \times p\)` as **rows** by **columns**. `\(\mathbf{X}\)` is a `\(3 \times 4\)` matrix:

`$$\mathbf{X}_{3 \times 4} = \begin{bmatrix} x_{11} & x_{12} & x_{13} & x_{14}\\ x_{21} & x_{22} & x_{23} & x_{24}\\ x_{31} & x_{32} & x_{33} & x_{34} \end{bmatrix}_{3\times 4}$$`

<br>
--

Individual elements are indexed in the same *row-column* order: `\(x_{23}\)` is the element in the second row and third column.
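The same row-then-column convention applies in R. A quick sketch with a made-up matrix (not the course data):

```r
X <- matrix(1:12, nrow = 3, ncol = 4)  # filled column by column
X[2, 3]  # row 2, column 3: returns 8
```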
]

---
class:middle

.left[
### Data example
]

.left-col[
Let's look at a simple example: data stored in a matrix `\(\mathbf{X}\)` with 10 individuals and 4 variables:

- age in years
- height in inches
- a dichotomous indicator of whether the individual likes dogs
- year of birth
]

.right-col[
`\(\mathbf{X} = \begin{bmatrix} 25 & 67 & \cdots & 1995 \\ 38 & 63 & \cdots & 1982 \\ \vdots &\vdots & \ddots & \vdots \\ 41 & 59 & \cdots & 1979 \end{bmatrix}\)`

`\(\mathbf{age} = \begin{bmatrix} 25 \\ 38 \\ \vdots \\ 41 \end{bmatrix}\)`

`\(age_{1} = 25\)`
]

---

.left[
### Data example: in R
]

__Build the data set__

```r
set.seed(6789)
n <- 10
dat <- data.frame(
  age = round(runif(n, 22, 45)),
  height = round(rnorm(n, 66, 4)),
  likes_dogs = rbinom(n, 1, .53)
)
dat$yob <- 2020 - dat$age
dat <- as.matrix(dat)
dat
```

```
##       age height likes_dogs  yob
##  [1,]  25     67          0 1995
##  [2,]  38     63          1 1982
##  [3,]  35     66          1 1985
##  [4,]  36     70          1 1984
##  [5,]  23     62          1 1997
##  [6,]  26     73          1 1994
##  [7,]  44     68          0 1976
##  [8,]  23     70          1 1997
##  [9,]  25     70          1 1995
## [10,]  41     59          0 1979
```

---

.left[
### Data example: in R
]

__What are the dimensions of this matrix?__

```r
# both dimensions
dim(dat)
```

```
## [1] 10  4
```

```r
# number of rows
nrow(dat)
```

```
## [1] 10
```

```r
# number of columns
ncol(dat)
```

```
## [1] 4
```

---

.left[
### Data example: in R
]

__Let's inspect parts of this matrix (vectors and scalars)__

```r
# Extract everyone's age (column vector)
dat[, 1]
```

```
## [1] 25 38 35 36 23 26 44 23 25 41
```

```r
# Find out if participant 2 likes dogs (scalar)
dat[2, 3]
```

```
## likes_dogs 
##          1
```

##### Q: How do we get all the data for participant 7?

---

### The geometry of vectors

.left-code[
Let's look at the age, height, and shoe size of participants 1 and 2:

`$$\mathbf{x}_1 = \begin{bmatrix} 25 \\ 67 \\10.5 \end{bmatrix} \quad \mathbf{x}_2 = \begin{bmatrix} 38 \\ 63 \\ 7 \end{bmatrix}$$`

The more attributes we measure about people, the higher-dimensional the space -- and the more precisely we can describe each person (they won't be near anyone else in space).
]

.right-plot[
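A sketch of how such a plot could be drawn, assuming the `scatterplot3d` package:

```r
# participants 1 and 2 as points in (age, height, shoe size) space
library(scatterplot3d)
x1 <- c(25, 67, 10.5)
x2 <- c(38, 63, 7)
scatterplot3d(
  x = c(x1[1], x2[1]), y = c(x1[2], x2[2]), z = c(x1[3], x2[3]),
  xlab = "age", ylab = "height", zlab = "shoe size", pch = 19
)
```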
]

---
class:middle

.left[
### Transposing vectors and matrices
]

We can flip a vector on its side by **transposing** it.

`\(\mathbf{x}_1 = \begin{bmatrix} 25 \\ 67 \\10.5 \end{bmatrix} \quad \mathbf{x}_2 = \begin{bmatrix} 38 \\ 63 \\ 7 \end{bmatrix}\)`

`\(\mathbf{x}_1^T = \begin{bmatrix} 25 & 67 & 10.5 \end{bmatrix} \quad \mathbf{x}_2^T = \begin{bmatrix} 38 & 63 & 7 \end{bmatrix}\)`

<br>
--

**Note:** We may also refer to transposed vectors (and matrices) with "prime" notation: `\(\mathbf{x}_1^T = \mathbf{x}_1'\)` (Not to be confused with a derivative!)

<br>
--

Stack these row vectors and you have a matrix!

`\(\mathbf{X} = \begin{bmatrix} \mathbf{x}_1^T \\ \mathbf{x}_2^T \end{bmatrix} = \begin{bmatrix} 25 & 67 & 10.5 \\ 38 & 63 & 7 \end{bmatrix}\)`

<br>
--

##### Q: What are the dimensions of `\(\mathbf{X}\)`?

---
class:middle

.left[
### Transposing vectors and matrices
]

We can also think of vectors as single-columned matrices:

- A column vector `\(\mathbf{y}\)` of length `\(n\)` has dimensions `\(n \times 1\)`.

.center[
`\(\begin{bmatrix}y_1 \\ y_2 \\ \vdots \\ y_n\end{bmatrix}_{n\times 1}\)`
]

- A row vector `\(\mathbf{y}'\)` of length `\(n\)` has dimensions `\(1 \times n\)`.

.center[
`\(\begin{bmatrix}y_1 & y_2 & \cdots & y_n\end{bmatrix}_{1\times n}\)`
]

Thinking about vectors this way will be really handy when we multiply vectors and matrices!

---
class:middle

.left[
### Transposing vectors and matrices
]

We can also transpose matrices:

`\(\mathbf{X} = \begin{bmatrix} 25 & 67 & \cdots & 1995 \\ 38 & 63 & \cdots & 1982 \\ \vdots &\vdots & \ddots & \vdots \\ 41 & 59 & \cdots & 1979 \end{bmatrix}\qquad \mathbf{X}^T = \mathbf{X}' = \begin{bmatrix} 25 & 38 & \cdots & 41 \\ 67 & 63 & \cdots & 59 \\ \vdots &\vdots & \ddots & \vdots \\ 1995 & 1982 & \cdots & 1979 \end{bmatrix}\)`

The former columns (variables) are now rows. The former rows (participants) are now columns.

---
class:middle

.left[
### Transposing vectors and matrices
]

We can transpose a matrix using the `t()` function:

```r
t(dat)
```

```
##            [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
## age          25   38   35   36   23   26   44   23   25    41
## height       67   63   66   70   62   73   68   70   70    59
## likes_dogs    0    1    1    1    1    1    0    1    1     0
## yob        1995 1982 1985 1984 1997 1994 1976 1997 1995  1979
```

If we save this as another object, we can extract elements using the **opposite** indices as before:

```r
dat_t <- t(dat)
dat[3, 4] # participant 3's year of birth
```

```
##  yob 
## 1985
```

```r
dat_t[4, 3]
```

```
##  yob 
## 1985
```

---

### Vector & matrix multiplication

When we multiply a scalar by a vector or a matrix, we can just do so element by element:

`\(a\mathbf{y} = \begin{bmatrix} ay_1 \\ ay_2 \\ \vdots \\ ay_n \end{bmatrix} \qquad \qquad \qquad a\mathbf{X} = \begin{bmatrix} ax_{11} & ax_{12} & \cdots & ax_{1p} \\ ax_{21} & ax_{22} & \cdots & ax_{2p} \\ \vdots &\vdots & \ddots & \vdots \\ ax_{n1} & ax_{n2} & \cdots & ax_{np} \end{bmatrix}\)`

.center[
#### But multiplying vectors and matrices requires special rules
]

---

### Vector & matrix multiplication

.left-col[
We can only multiply vectors of the same length, and we have to **transpose** one before we can do so.
- Consider two vectors of length `\(p\)`, `\(\mathbf{b}\)` and `\(\mathbf{c}\)`:
- Think of them as matrices, each with 1 column
- To multiply them, their **inner** dimensions must match, so we must transpose one of them:

`\(\mathbf{b}^T\mathbf{c} = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_p \end{bmatrix}^T \begin{bmatrix} c_1 \\ c_2 \\ \vdots \\ c_p \end{bmatrix}\)`

`\(= \begin{bmatrix} b_1 & b_2 & \cdots & b_p \end{bmatrix} \begin{bmatrix} c_1 \\ c_2 \\ \vdots \\ c_p \end{bmatrix}\)`

`\(=b_1c_1 + b_2c_2 + \cdots + b_pc_p\)`

`\(= \sum_{i = 1}^p b_ic_i\)`
]

.right-col[

]

---

### (Note on dimensions)

Notice that I explicitly write out the dimensions when multiplying `\(\mathbf{Y}_{m\times n}\mathbf{X}_{n\times p}\)`. Just like we can only multiply vectors of the same length, matrices must have compatible dimensions in order to be multiplied.

.center[
<iframe src="https://giphy.com/embed/ZDgXvr2vpsgZG" width="480" height="257" frameBorder="0" class="giphy-embed" allowFullScreen></iframe>
]

The **inner dimensions** must match, then the resulting product is a matrix with the **outer dimensions**.

.center[
#### I think of this as the inner dimensions "collapsing"
]

---

### Matrix `\(\times\)` matrix multiplication

Let's multiply an `\(m\times n\)` matrix `\(\mathbf{Y}\)` by the `\(n\times p\)` matrix `\(\mathbf{X}\)`:

`\(\mathbf{Y}_{m\times n}\mathbf{X}_{n\times p} = \begin{bmatrix} \mathbf{y_{11}} & \mathbf{y_{12}} & \cdots & \mathbf{y_{1n}} \\ y_{21} & y_{22} & \cdots & y_{2n} \\ \vdots &\vdots & \ddots & \vdots \\ y_{m1} & y_{m2} & \cdots & y_{mn} \end{bmatrix} \begin{bmatrix} \mathbf{x_{11}} & x_{12} & \cdots & x_{1p} \\ \mathbf{x_{21}} & x_{22} & \cdots & x_{2p} \\ \vdots &\vdots & \ddots & \vdots \\ \mathbf{x_{n1}} & x_{n2} & \cdots & x_{np}\end{bmatrix}\)`

`\(= \begin{bmatrix} \mathbf{z_{11}} & & \cdots & \\ & & \cdots & \\ \vdots &\vdots & \ddots & \vdots \\ & & \cdots &\end{bmatrix}\)`

where `\(\mathbf{z}_{11}\)` is the product of the first **row** of `\(\mathbf{Y}\)` and the first **column** of `\(\mathbf{X}\)`

<br>
--

.center[
`\(\mathbf{z}_{11} = \sum_{i = 1}^n y_{1i}x_{i1}\)`
]

---

### Matrix `\(\times\)` matrix multiplication

Now we do the same vector-vector multiplication with every pair of a row from `\(\mathbf{Y}\)` and a column from `\(\mathbf{X}\)`:

`\(\mathbf{Y}_{m\times n}\mathbf{X}_{n\times p} = \begin{bmatrix} \mathbf{y_{11}} & \mathbf{y_{12}} & \cdots & \mathbf{y_{1n}} \\ y_{21} & y_{22} & \cdots & y_{2n} \\ \vdots &\vdots & \ddots & \vdots \\ y_{m1} & y_{m2} & \cdots & y_{mn} \end{bmatrix} \begin{bmatrix} x_{11} & \mathbf{x_{12}} & \cdots & x_{1p} \\ x_{21} & \mathbf{x_{22}} & \cdots & x_{2p} \\ \vdots &\vdots & \ddots & \vdots \\ x_{n1} & \mathbf{x_{n2}} & \cdots & x_{np} \end{bmatrix}\)`

`\(= \begin{bmatrix} z_{11} & \mathbf{z_{12}} & \cdots & \\ & & \cdots & \\ \vdots &\vdots & \ddots & \vdots \\ & & \cdots & \end{bmatrix}\)`

(You don't have to do it in any particular order, just keep multiplying until all row-column pairs have been multiplied.)

.center[
##### Q: What's the expression for `\(\mathbf{z}_{23}\)`?
]

<br>
--

`\(\mathbf{z}_{23} = \sum_{i = 1}^n y_{2i}x_{i3}\)`

---

### Matrix `\(\times\)` matrix multiplication: in R

```r
X <- matrix(c(6, 3, 1, 6, 2, 3), ncol = 3)
X
```

```
##      [,1] [,2] [,3]
## [1,]    6    1    2
## [2,]    3    6    3
```

```r
tX <- t(X)
tX
```

```
##      [,1] [,2]
## [1,]    6    3
## [2,]    1    6
## [3,]    2    3
```

```r
Z <- tX %*% X
Z
```

```
##      [,1] [,2] [,3]
## [1,]   45   24   21
## [2,]   24   37   20
## [3,]   21   20   13
```

---

### Matrix `\(\times\)` matrix multiplication: in R

```r
Z <- tX %*% X
Z
```

```
##      [,1] [,2] [,3]
## [1,]   45   24   21
## [2,]   24   37   20
## [3,]   21   20   13
```

```r
tX[1, ] %*% X[, 1]
```

```
##      [,1]
## [1,]   45
```

#### NOTE!!! Order of matrices in the multiplication matters!

```r
W <- X %*% tX
W
```

```
##      [,1] [,2]
## [1,]   41   30
## [2,]   30   54
```

---

### Identity matrix

We all know 1 is special. When you multiply a number by 1, you get the same number. Matrices have their own special matrix, the identity matrix: `\(\mathbf{I}\)`.

.center[
`\(\mathbf{XI} = \mathbf{X} \text{ for any matrix } \mathbf{X}\)`
]

<br>
--

For example:

`\(\mathbf{QI} = \begin{bmatrix} r & s & t \\ u & v & w \end{bmatrix}_{2\times3}\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1\end{bmatrix}_{3\times3} = \begin{bmatrix} r & s & t \\ u & v & w \end{bmatrix}_{2\times3}\)`

---

### Symmetric matrices

The identity matrix `\(\mathbf{I}\)` is an example of a symmetric matrix: `\(\mathbf{I}_{i,j} = \mathbf{I}_{j,i}\)` for any `\(i\)` and `\(j\)`.

If you multiply a matrix by its own transpose, the result is a symmetric matrix:

.center[
`\(\mathbf{Q}^T\mathbf{Q} = \begin{bmatrix} r^2 + u^2 & rs + uv & rt + uw \\ rs + uv & s^2 + v^2 & st + vw \\ rt + uw & st + vw & t^2 + w^2 \end{bmatrix}\)`
]

---

### Inverse of a matrix and linear independence

The inverse of a matrix `\(\mathbf{Q}\)` is the matrix `\(\mathbf{Q^{-1}}\)` such that

`$$\mathbf{Q}\mathbf{Q^{-1}} = \mathbf{I}$$`

- For a scalar `\(a\)`, `\(a^{-1}=\frac{1}{a}\)` is the multiplicative inverse of `\(a\)`: when we multiply the two together, we get 1
- For a matrix, the inverse is much more difficult to find, and does not exist if the columns are **linearly dependent**

---

### Inverse of a matrix and linear independence: in R

.pull-left[
We can attempt to invert a matrix using the `solve()` function:

```r
mat_a <- matrix(c(2, 6, 1, 8), ncol = 2)
mat_a
```

```
##      [,1] [,2]
## [1,]    2    1
## [2,]    6    8
```

```r
solve(mat_a)
```

```
##      [,1] [,2]
## [1,]  0.8 -0.1
## [2,] -0.6  0.2
```
]

.pull-right[
We can check to make sure this is the inverse:

```r
mat_a_inv <- solve(mat_a)
mat_a_inv %*% mat_a
```

```
##      [,1] [,2]
## [1,]    1    0
## [2,]    0    1
```

#### We get the identity matrix `\(\mathbf{I}\)`!
]

---

### Inverse of a matrix and linear independence: in R

What if we have a matrix whose columns are not linearly independent?

```r
mat_b <- matrix(c(2, 6, 1, 3), ncol = 2)
mat_b
```

```
##      [,1] [,2]
## [1,]    2    1
## [2,]    6    3
```

```r
solve(mat_b)
```

```
## Error in solve.default(mat_b): Lapack routine dgesv: system is exactly singular: U[2,2] = 0
```

**Whenever you get an error message about something being "singular", that's code for a matrix not being invertible -- check for linear dependence!**

##### Q: How could you tell the matrix above is not invertible?

---

### Inverse of a matrix and linear independence

Consider a situation in which we're predicting height as a linear function of several variables we have in our dataset: age in years, shoe size, and age in months.
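One way to lay this data out in R (a sketch; the names `height`, `years`, `shoe`, and `months` are mine):

```r
height <- c(51, 61, 52, 65, 60, 48)
years  <- c(30, 31, 25, 35, 42, 27)
shoe   <- c(7, 10, 9, 10, 6, 7)
months <- 12 * years  # exactly 12 times age in years, by construction
X <- cbind(years, shoe, months)
```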
.center[
`\(\textbf{height} = \begin{bmatrix} 51 \\ 61 \\ 52 \\ 65 \\ 60 \\ 48 \end{bmatrix} \qquad \qquad \textbf{other variables} = \begin{bmatrix} 30 & 7 & 360 \\ 31 & 10 & 372 \\ 25 & 9 & 300 \\ 35 & 10 & 420 \\ 42 & 6 & 504 \\ 27 & 7 & 324 \end{bmatrix}\)`
]

.pull-left[
We can plot predicted height as a function of age in years alone. This is a line.
]

.pull-right[
.center[
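A sketch of how this plot could be drawn, using the vectors defined on the previous slide:

```r
# height against age in years, with the fitted regression line
fit_years <- lm(height ~ years)
plot(years, height, pch = 19)
abline(fit_years)
```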
]
]

---

### Linear independence, cont.

.pull-left[
If we plot predicted height as a function of age in years *and* shoe size, we get a **plane**.

<br> <br> <br> <br> <br>

That is, we get more information about height from knowing someone's shoe size. If two people are the same age, but have different shoe sizes, we'll predict different heights for them.
]

.pull-right[
.center[
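A sketch of the corresponding plot, again assuming the `scatterplot3d` package:

```r
# height against age in years and shoe size, with the fitted plane
library(scatterplot3d)
fit_plane <- lm(height ~ years + shoe)
s3d <- scatterplot3d(years, shoe, height, pch = 19)
s3d$plane3d(fit_plane)  # overlay the fitted regression plane
```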
]
]

---

### Linear dependence

.pull-left[
But if we plot predicted height as a function of age in years and *age in months*, we get a line again, instead of a plane.

<br> <br>

That's because age in years and age in months are **linearly dependent**: we can write one as a linear combination of the other; that is, age in months = 12 `\(\times\)` age in years.

We don't get any extra information from knowing age in months that we didn't already have from age in years.
]

.pull-right[
.center[
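We can see the redundancy directly in R (a sketch, continuing with the vectors above):

```r
# months is an exact linear function of years, so lm() cannot
# separate their effects: the months coefficient comes back NA
fit_dep <- lm(height ~ years + months)
coef(fit_dep)  # the months entry is NA (aliased)
```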
]
]

---

### In "real life"

Linear dependence will come into play when we cover regression. In the meantime, we can practice some regression matrix notation!

You may be used to seeing a linear regression equation written out like this:

.center[
`\(y_i = \beta_0 + \beta_1x_{i1} +\beta_2x_{i2} + \epsilon_i\)`
]

Well, that's the same thing as:

.center[
`\(y_i = \boldsymbol{\beta}^T\mathbf{x}_i + \epsilon_i\)`
]

where `\(\;\boldsymbol{\beta} = \begin{bmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \end{bmatrix}_{3 \times 1}\;\)` and `\(\;\mathbf{x}_i = \begin{bmatrix} x_{i0} \\ x_{i1} \\ x_{i2} \end{bmatrix}_{3 \times 1}\)` where `\(x_{i0} = 1\)` for all `\(i\)`

<br>

.center[
#### Try multiplying it out by hand, first making sure the dimensions are compatible (notice the necessity to transpose `\(\;\boldsymbol{\beta}\)`)!
]

Stay tuned for more on this when we cover regression analysis 😁

---
class:middle
background-image: url("img/resources.jpg")
background-size:cover

# More resources

.pull-left[
[**This**](https://www.khanacademy.org/math/precalculus/precalc-matrices) whole section has a lot of great information and practice with matrices.

For a more advanced introduction, work through the sections on vectors, linear combinations, and linear dependence [**here**](https://www.khanacademy.org/math/linear-algebra/vectors-and-spaces).

You can also pick and choose from the videos [**here**](https://www.khanacademy.org/math/linear-algebra/matrix-transformations), particularly those on functions and linear transformations.
]

---
class:middle,center

.left[
# Part 2: Functions, data transformations and a sprinkle of calculus
]

#### Now that we've explored how data is stored, let's look at how data can be transformed

---

### Logarithms

.left-code[
If you see `\(\log(x)\)` in this class, or basically anywhere in probability and statistics, it will refer to the natural logarithm, or `\(\ln(x)\)`.

#### What do you notice about the function `\(\log(x)\)`?

<br> <br> <br> <br>

You can only "log" a positive number. Something like `\(\log(-1)\)` is undefined.

We can see that `\(\lim_{x \to 0^+}\log(x) = -\infty\)`.

Importantly, `\(\log(1) = 0\)`, so `\(\log(x)\)` for any `\(x\)` between 0 and 1 will give you a negative number.
]

.right-plot[
<img src="math_review_files/figure-html/logGraph-1.png" width="100%" style="display: block; margin: auto;" />
]

---

### Exponentiation

.left-code[
The **inverse** of the natural log is the natural exponential function `\(e^x\)`, which we also write as `\(\exp(x)\)`:

.center[
`\(\exp(x) = y \iff x = \log(y)\)`
]

so if one side of an equation is exponentiated, we can always "get out of it" by applying a logarithm to both sides, and vice versa.

.center[
`\(\log(\exp(x)) = x\;\)` and `\(\;\exp(\log(x)) = x\)`
]

#### But the second identity only works for positive `\(x\)`!

Recall that we can't take a log of a negative number, and exponentiating a number is never going to give us anything negative.
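A quick check in R:

```r
log(exp(-3)) # -3: exp() accepts any x, so we can always recover it
exp(log(2))  # 2: fine, since 2 > 0
log(-1)      # NaN, with a warning: log() needs a positive input
```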
]

.right-plot[
<img src="math_review_files/figure-html/expGraph-1.png" width="100%" style="display: block; margin: auto;" />
]

---

### Rules to live by

When we have a sum inside an exponent, we can decompose this into the product of two exponents:

`$$\exp(a + b) = \exp(a)\exp(b)$$`

Similarly, a product inside a logarithm can be written as a sum of logs:

`$$\log(ab) = \log(a) + \log(b)$$`

Of course, the same is true of the inverses of addition and multiplication, subtraction and division:

`$$\exp(a - b) = \frac{\exp(a)}{\exp(b)}$$`

`$$\log\left(\frac{a}{b}\right) = \log(a) - \log(b)$$`

#### You will need to know these rules to understand various regression models in this class, I promise!

---

### Why do we care?

The log function takes a number `\(x\)` constrained to be `\(>0\)`, and transforms it onto an unbounded number space.

- Probability: ranges from 0 to 1 (strictly constrained)
- Say we know `\(Pr[likesdogs=1]\)` is a function of age and height: how do we take two numbers, both outside the range of 0 to 1, and get a probability that is strictly between 0 and 1?

.center[


#### The logit function is here to help!
]

---

### Odds and probabilities

.pull-left[
Let's break it down. Think about flipping a coin lots of times: heads you win, tails you lose.

- A **probability** describes the number of successes out of the total number of trials (a proportion)
- An **odds** describes the number of successes compared to the number of failures (a ratio)

Let's say you get 4 heads out of 10 flips:

- probability = `\(\frac{4}{10}\)`
- odds = `\(\frac{4}{6}\)`

#### These are really different numbers!
]

.pull-right[
<br> <br>

.center[
`\(odds = \frac{prob}{1 - prob}\)`

`\(prob = \frac{odds}{1 + odds}\)`
]
]

---

### Logits and expits

.left-code[
The **logit** of `\(p\)`, aka the **log-odds** of `\(p\)`, can take a number between 0 and 1 (like a probability!) and transform it to a number between `\(-\infty\)` and `\(\infty\)`.

.center[
`\(logit(p) = \log\left(\frac{p}{1 - p}\right)\)`
]

We can invert it to get the **expit** function, which can take any number on the real line and transform it to a value between 0 and 1:

.center[
`\(expit(x) = \frac{\exp(x)}{1 + \exp(x)}\)`
]
]

.right-plot[
<img src="math_review_files/figure-html/logitGraph-1.png" width="100%" style="display: block; margin: auto;" />
]

---

### Back to the data example:

```
##    age height likes_dogs  yob
## 1   25     67          0 1995
## 2   38     63          1 1982
## 3   35     66          1 1985
## 4   36     70          1 1984
## 5   23     62          1 1997
## 6   26     73          1 1994
## 7   44     68          0 1976
## 8   23     70          1 1997
## 9   25     70          1 1995
## 10  41     59          0 1979
```

---

### Back to the data example:

Consider the following logistic regression model for the predicted probability that someone likes dogs ( `\(p_i\)` ) given their age in years and height in inches:

`\(\widehat{\log\left(\frac{p_i}{1-p_i}\right)} = 0.4 + 0.07*age_i - 0.005*height_i\)`

#### What's the probability that person 4 likes dogs?

.left-col[
Log-odds: `\(\log\left(\frac{p_4}{1-p_4}\right) = 0.4 + 0.07*36 - 0.005*70 = 2.57\)`

Odds: `\(\frac{p_4}{1-p_4} = \exp\left(2.57\right) = 13.066\)`

Probability: `\(p_4 = \frac{13.066}{1+13.066} = \frac{\exp\left(0.4 + 0.07*36 - 0.005*70\right)}{1 + \exp\left(0.4 + 0.07*36 - 0.005*70\right)} = 0.929\)`
]

.right-col[


##### Q: What's the probability that a person who's 20 years old and 67 inches tall likes dogs?
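To check your answer in R, note that the built-in `plogis()` function is exactly the expit:

```r
# person 4's probability, as computed by hand above
log_odds <- 0.4 + 0.07 * 36 - 0.005 * 70  # 2.57
plogis(log_odds)  # expit(2.57) = 0.929
```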
]

---
class: inverse, middle, center

# A quick dip into calculus

#### More intuition, less mechanics

---

### Derivatives

.left-code[
The basic idea of a derivative is that it describes the rate of change of a function.

If the function we're looking at is `\(g(x)\)`, then there are a couple of ways we usually notate the first derivative of `\(g(x)\)`, which we'll use interchangeably:

`$$g'(x) = \frac{d}{dx}g(x)$$`

Both equivalently tell us that we are looking at the function `\(g(x)\)` and taking the first derivative with respect to the variable `\(x\)`. That means that as `\(x\)` changes, we want to know how much `\(g(x)\)` changes. This is just the slope of `\(g(x)\)` at a given value of `\(x\)`.
]

.right-plot[
.center[
#### Where is `\(g'(x)\)` greatest?

<img src="math_review_files/figure-html/slope1-1.png" width="100%" style="display: block; margin: auto;" />
]
]

---

### More derivatives

.left-code[
.center[
#### Where is `\(g'(x)\)` greatest?

<img src="math_review_files/figure-html/slope2-1.png" width="100%" style="display: block; margin: auto;" />
]
]

--

.right-plot[
#### Where does `\(g'(x) = 0\)`?

When the first derivative is 0, the function may have reached its maximum or minimum. So if you want to maximize a function, one way to do so is to differentiate it and then set it equal to 0.
]

---

### Integrals

.left-code[
We might want to describe a function by the area under its curve.

An integral tells us how much cumulative space a function is covering (in terms of distance from the `\(x\)`-axis) as `\(x\)` gets larger.
]

.right-plot[
In the graph of `\(f(x) = 2x^3 + 3x^2 + 4\)` below, the area in blue is represented by the integral

`$$\int_{-2}^{2} \left(2x^3 + 3x^2 + 4\right) \; dx$$`

<img src="math_review_files/figure-html/index-3-1.png" width="70%" style="display: block; margin: auto;" />
]

---

### Integrals

.left-code[
`$$\int_{-2}^{2} \left(2x^3 + 3x^2 + 4\right) \; dx$$`

The values at the top and bottom of the integral sign are those between which we're computing the integral. We could integrate over the whole function, from `\(- \infty\)` to `\(\infty\)`, or choose other limits of integration:

<img src="math_review_files/figure-html/unnamed-chunk-20-1.png" width="70%" style="display: block; margin: auto;" />

#### What limits of integration are displayed here?
]

.right-plot[
The integral of a non-negative function can only grow as the upper limit increases:

`$$\int_{-2}^{-1} \left(2x^3 + 3x^2 + 4\right) \; dx$$`

`$$\leq \int_{-2}^{0} \left(2x^3 + 3x^2 + 4\right) \; dx$$`

`$$\leq\int_{-2}^{1} \left(2x^3 + 3x^2 + 4\right) \; dx$$`

`$$\leq \int_{-2}^{2} \left(2x^3 + 3x^2 + 4\right) \; dx$$`

and so on. On the graph, the curve can only accumulate area under itself, so the integral evaluated at greater and greater upper limits can only increase.
]

---

### Data example

Let's pretend we grabbed a larger sample from the same population, again recording age, height, affinity for dogs, and year of birth:

```r
set.seed(6789)
n <- 1000
dat <- data.frame(
  age = round(runif(n, 22, 45)),
  height = round(rnorm(n, 66, 4)),
  likes_dogs = rbinom(n, 1, .53)
)
dat$yob <- 2020 - dat$age
summary(dat)
```

```
##       age            height        likes_dogs         yob      
##  Min.   :22.00   Min.   :55.00   Min.   :0.000   Min.   :1975  
##  1st Qu.:28.00   1st Qu.:63.00   1st Qu.:0.000   1st Qu.:1981  
##  Median :33.00   Median :66.00   Median :1.000   Median :1987  
##  Mean   :33.12   Mean   :66.13   Mean   :0.524   Mean   :1987  
##  3rd Qu.:39.00   3rd Qu.:69.00   3rd Qu.:1.000   3rd Qu.:1992  
##  Max.   :45.00   Max.   :80.00   Max.   :1.000   Max.   :1998
```

---

### Data example

.center[
__The distribution of height in inches__:

<img src="math_review_files/figure-html/unnamed-chunk-22-1.png" width="70%" style="display: block; margin: auto;" />

##### Q: Roughly what proportion of the sample is less than 70 inches tall? What's an expression for this quantity in terms of the integral of the PDF of height?
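One way to check in R (a sketch; heights were drawn from a Normal(66, 4), so the smooth-curve version of the answer is the integral of that normal PDF up to 70):

```r
mean(dat$height < 70)         # empirical proportion under 70 inches
pnorm(70, mean = 66, sd = 4)  # integral of the PDF from -Inf to 70
```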
]

---
class:middle
background-image: url("img/resources.jpg")
background-size:cover

### More resources

.pull-left[
[**Here**](https://www.khanacademy.org/math/calculus-home/taking-derivatives-calc) is a **lot** of information about derivatives. You don't need more than the first few videos. Same with [**this**](https://www.khanacademy.org/math/calculus-home/integral-calculus/definite-integrals-intro-ic) intro to integrals.

Watch the video on antiderivatives and indefinite integrals from [**this**](https://www.khanacademy.org/math/calculus-home/integral-calculus/indefinite-integrals) page and some of those on [**this**](https://www.khanacademy.org/math/calculus-home/integral-calculus/fundamental-theorem-of-calculus-ic) page to understand the link between derivatives and integrals.
]