If you collaborated with anyone, you must include “Collaborated with: FIRSTNAME LASTNAME” at the top of your lab!
1a. (3 points) The hard-threshold function is defined as \[f_\lambda(x) = \begin{cases} x & |x| \geq \lambda\\ 0 & |x| < \lambda \end{cases}\] Write an R function that takes two parameters, numeric input x and a threshold lambda. Your function should return the value of \(f_\lambda(x)\) and work for vector input x of any length.
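As a reminder of what "work for vector input of any length" means in practice, here is a minimal sketch of a different piecewise function written with ifelse(), which evaluates its condition element by element. The function name relu is a hypothetical choice for illustration only; this is not the solution to 1a, just one possible pattern.

```r
# Hypothetical example (not the answer to 1a): keep positive entries, zero out the rest.
relu <- function(x) {
  # ifelse() tests the condition elementwise, so x can be a vector of any length
  ifelse(x > 0, x, 0)
}

relu(c(-2, 0, 3.5))  # returns 0.0 0.0 3.5
```

Other vectorized approaches, such as multiplying x by a logical condition, may work equally well.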
1b. (1 point) Set \(\lambda = 4\) and demonstrate your function on the vector c(-5, -3, 0, 3, 5).
(Hint: the output should be the vector -5, 0, 0, 0, 5.)
1c. (1 point) Set \(\lambda = 2\) and demonstrate your function on the vector c(-7, -5, -3, 0, 3, 5, 7).
2a. (3 points) The soft-threshold function is defined as \[g_\lambda(x) = \begin{cases} \operatorname{sign}(x)(|x| - \lambda) & |x| \geq \lambda\\ 0 & |x| < \lambda \end{cases}\] Write an R function that takes two parameters, numeric input x and a threshold lambda. Your function should return the value of \(g_\lambda(x)\) and work for vector input x of any length.
2b. (1 point) Set \(\lambda = 4\) and demonstrate your function on the vector c(-5, -3, 0, 3, 5).
(Hint: the output should be the vector -1, 0, 0, 0, 1.)
2c. (1 point) Set \(\lambda = 2\) and demonstrate your function on the vector c(-7, -5, -3, 0, 3, 5, 7).
Many popular functions in R output lists in order to return multiple objects of different types and lengths. Here we will look at the function lm, which performs linear regression.
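To make the idea of a list-valued return concrete, here is a small sketch using a made-up function. The name summarize_vec and its components are hypothetical, chosen only to show how items of different types and lengths can be bundled together and then retrieved by name.

```r
# Hypothetical helper: returns a list whose items differ in type and length.
summarize_vec <- function(x) {
  list(n = length(x),        # a single integer
       range = range(x),     # a numeric vector of length 2
       label = "summary")    # a character string
}

res <- summarize_vec(c(4, 8, 15))
names(res)    # "n" "range" "label"
res$range     # 4 15
res[["n"]]    # 3
```

The same accessors, names(), $ and [[ ]], apply to any list-like object returned by an R function.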
First, run the following code to create an object of class lm.
linearMod <- lm(dist ~ speed, data = cars)
3a. (2 points) What are the names of the items in the list linearMod?
3b. (4 points) Store the coefficients within linearMod as a new variable. What are the coefficients and their interpretations?
For problem 4, we will use an adapted version of the weather data from Tidyverse.
library(kableExtra)
weather <- data.frame("station" = rep(c("A", "B", "C"), each = 4),
                      "element" = rep(c("temp_min", "temp_max"), 2),
                      "month1" = c(11.4, 25.6, NA, NA, 17.7, 28.0,
                                   NA, NA, 20.0, 24.9, NA, NA),
                      "month2" = c(NA, NA, 16.8, 28.7, NA, NA,
                                   11.1, 26.8, NA, NA, 14.7, 33.4))
kable_styling(kable(weather))
station | element | month1 | month2 |
---|---|---|---|
A | temp_min | 11.4 | NA |
A | temp_max | 25.6 | NA |
A | temp_min | NA | 16.8 |
A | temp_max | NA | 28.7 |
B | temp_min | 17.7 | NA |
B | temp_max | 28.0 | NA |
B | temp_min | NA | 11.1 |
B | temp_max | NA | 26.8 |
C | temp_min | 20.0 | NA |
C | temp_max | 24.9 | NA |
C | temp_min | NA | 14.7 |
C | temp_max | NA | 33.4 |
4. (4 points) Use kable() to present a tidied-up version of this data, without any NA values. Do you prefer the tidied version? Why or why not? (Hint: My table has 4 variables, 6 observations, and no NA values. I only kept one of the variables unchanged.)