R Coding
Homework 4
Your name
Honor code:
I have neither given nor received unauthorized assistance on this assignment. Type your initials here.
I received help from “” and gave help to “”.
Problem 1 Convert mtcars data to a tibble and add one variable for the car brand and one variable for car model. Store the new tibble as mtcars1. Use barplot to show the distribution of car brand.
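Hint: the sketch below shows one possible way to approach Problem 1, not the required solution. It assumes that the first word of each row name (for example, "Mazda" in "Mazda RX4") is the brand and the rest is the model; the column names car, brand, and model are illustrative choices.

```r
library(tidyverse)

# Tibbles drop row names, so move them into a column before converting.
mtcars1 <- mtcars %>%
  rownames_to_column(var = "car") %>%
  as_tibble() %>%
  # Assumption: first word = brand, remainder = model.
  # One-word names such as "Valiant" get NA for the model.
  separate(car, into = c("brand", "model"), sep = " ",
           extra = "merge", fill = "right")

# Count cars per brand and draw the bar plot (las = 2 rotates the labels).
barplot(table(mtcars1$brand), las = 2, ylab = "Number of cars")
```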
Problem 2 The 2 × 2 table below stores the number of cases in a study of diabetes.
                        male  female
Have diabetes             20      10
Do not have diabetes      10      29
The data is entered as
t1 <- tibble(diabetes=c("yes","no"), male=c(20,20), female=c(10,29))
Tidy this simple tibble. Do you need to make it wider or longer? What are the variables?
Problem 3 Load table2 from the tidyverse package. Compute the rate for table2. You need to do the following operations:
1. tidy table2;
2. calculate the rate by dividing cases by population;
3. create a new tibble with an additional variable: rate.
Problem 4 Load the who data from tidyverse. Read the following meta info for the column names:
a. The first three letters of each column name denote whether the column contains new or old cases of TB. In this dataset, each column contains new cases.
b. The next two or three letters describe the type of TB: rel stands for cases of relapse; ep stands for cases of extrapulmonary TB; sn stands for cases of pulmonary TB that could not be diagnosed by a pulmonary smear (smear negative); sp stands for cases of pulmonary TB that could be diagnosed by a pulmonary smear (smear positive).
c. The sixth letter gives the sex of TB patients. The dataset groups cases by males (m) and females (f).
d. The remaining numbers give the age group. The dataset groups cases into seven age groups: 014 = 0-14 years old; 1524 = 15-24 years old; 2534 = 25-34 years old; 3544 = 35-44 years old; 4554 = 45-54 years old; 5564 = 55-64 years old; 65 = 65 or older.
Tidy the data:
1. Lengthen the tibble by converting the column names new_sp_m014:newrel_f65 into values of a variable (for example, key).
2. Find and correct the inconsistency in the names, i.e., newrel -> new_rel.
3. Extract the type, sex, and age information from the key variable and create a variable for each of them. Also, remove redundant variables.
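Hint: the three tidying steps for the who data can be sketched as follows. This is one possible approach, not the only valid one. The names who_tidy, cases, new, and sexage are illustrative, and both dropping missing values and treating iso2 and iso3 as the redundant variables are assumptions.

```r
library(tidyverse)

who_tidy <- who %>%
  # Step 1: gather the case columns into key/value pairs.
  # Assumption: rows with missing counts are dropped.
  pivot_longer(new_sp_m014:newrel_f65,
               names_to = "key", values_to = "cases",
               values_drop_na = TRUE) %>%
  # Step 2: make the naming consistent (newrel -> new_rel).
  mutate(key = str_replace(key, "newrel", "new_rel")) %>%
  # Step 3: split key at the underscores, then split the sex letter from the age.
  separate(key, into = c("new", "type", "sexage"), sep = "_") %>%
  separate(sexage, into = c("sex", "age"), sep = 1) %>%
  # Assumption: new is constant, and iso2/iso3 duplicate country.
  select(-new, -iso2, -iso3)
```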
Problem 5 Extract the number of cases for each year. Make an informative plot to compare the number of cases in different years. Extract the number of cases for each type of patient (types: rel, ep, sn, sp). Make an informative plot to compare the number of cases for different types.
Problem 6 Convert type, sex, and age into factors and label their levels properly. For example, the levels for sex are Male and Female, and the levels for age are 014, ...
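Hint: a minimal sketch of relabeling factor levels, using a small hypothetical vector rather than the who data. The names sex and sex_f are illustrative only.

```r
# Hypothetical codes, standing in for the sex column of the tidy data.
sex <- c("m", "f", "f", "m")

# levels gives the codes as stored; labels gives the names to display.
sex_f <- factor(sex, levels = c("m", "f"), labels = c("Male", "Female"))

levels(sex_f)
table(sex_f)
```

The same pattern can be applied inside mutate() to each column that should become a factor.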
Problem 7 Use boxplot to show the distribution of cases for each type. Use boxplot to show the distribution of cases for each combination of age group and sex. (Hint: use formula ~type or ~ age + sex)
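Hint: the formula interface of boxplot can be sketched on made-up data. The data frame toy below is hypothetical (random counts, not the who data); only the form of the two calls carries over.

```r
# Hypothetical toy data with the same column names as the tidy who data.
set.seed(1)
toy <- data.frame(
  type  = rep(c("sp", "sn"), each = 20),
  sex   = rep(c("m", "f"), times = 20),
  age   = rep(c("014", "1524"), each = 2, times = 10),
  cases = rpois(40, lambda = 50)
)

# One box per type.
boxplot(cases ~ type, data = toy)

# One box per combination of age group and sex.
b <- boxplot(cases ~ age + sex, data = toy)
```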