Statistics/ Stata
Part II. Stata
A. Make a New Variable (using what you learned in Lab A)
Use gss2016.dta
1. Use the continuous variable age to create a new ordinal variable called age_cat. The new variable should be coded as follows: age_cat = 1 if the person is 18-32 years old; age_cat = 2 if the person is 33-67 years old; age_cat = 3 if the person is 68 years old or older. Use the generate and recode method you learned in Lab A.
a. Copy down the commands you used to create the new variable here:
b. Run a tabulate on the new variable and cut-n-paste the new output here:
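For reference, one common generate-and-recode pattern looks like the sketch below. It assumes gss2016.dta is in Stata's current working directory and that age holds whole years from 18 upward; the exact method taught in Lab A may differ.

```stata
* Sketch only: assumes gss2016.dta is in the current working directory
use gss2016.dta, clear

* Step 1: copy age into a new variable so the original stays unchanged
generate age_cat = age

* Step 2: collapse the copied values into the three categories
recode age_cat (18/32 = 1) (33/67 = 2) (68/max = 3)

* Step 3: check the result with a frequency table
tabulate age_cat
```

Missing values of age remain missing in age_cat, because recode only changes values that match one of the listed rules.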
B. From the Highlights Video and readings
Use gss2016.dta
1. Use the tab command on the variable astrosci. Cut-n-paste the command and results/output here:
2. The variable astrosci measures the degree to which the respondent thinks astrology is scientific.
a. What level-of-measurement is astrosci?
b. Does it make sense to use the mean value of astrosci as a measure of central tendency? Why or why not? If so, what is the mean value?
c. Does it make sense to use the median value of astrosci as a measure of central tendency? Why or why not? If so, what is the median value?
d. Does it make sense to use the mode value of astrosci as a measure of central tendency? Why or why not? If so, what is the mode value?
3. The variable hrs1 measures the number of hours the respondent worked in the past week. Use the summarize command on hrs1 to find the mean and the standard deviation (add the detail option to also see the median). Cut-n-paste the command and output/results here:
4. What is the mean number of hours worked the previous week among this sample?
5. Interpret the standard deviation (sd) in a sentence, as we did in the lecture part of class.
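The commands for this part can be sketched as follows. This is a minimal example that assumes gss2016.dta is in the current working directory; no output is shown because it depends on your copy of the data.

```stata
* Sketch only: assumes gss2016.dta is in the current working directory
use gss2016.dta, clear

* Frequency table for a categorical variable (the mode is the largest category)
tabulate astrosci

* Basic summary for a continuous variable: observations, mean, sd, min, max
summarize hrs1

* The detail option adds percentiles; the 50th percentile is the median
summarize hrs1, detail
```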
Use ANES2016.dta
The ANES2016 data are from the 2016 American National Election Studies (ANES) survey. This survey is representative of all voters in the 2016 election. It uses the feeling thermometer survey tool to gauge the respondents' feelings towards certain people, groups, or issues. Respondents are asked to express their feelings in terms of temperature, with 0 indicating very cold and 100 indicating very warm. Ratings below 50 indicate relative dislike (negative feelings) and ratings above 50 indicate relative liking (positive feelings).
V161086: feeling thermometer toward Hillary Clinton
V161093: feeling thermometer toward Bill Clinton
1. Use the summarize command on V161086 and then on V161093. (Note that we are including some negative values, but that's OK for now; just ignore them.) Cut-n-paste the two commands and results/output here:
2. We will use the summarize command on both of these variables, but we want to see more than the typical summarize output. So we will tell Stata to give us more detail using the detail option. Use the commands summarize V161086, detail and summarize V161093, detail to see the mean, standard deviation, range, and values for the 25th, 50th, and 75th percentiles. Cut-n-paste the commands and results/output here:
3. Using the mean and median, write a few sentences that describe how, on average, voters feel about Bill and Hillary. Be sure to indicate whom voters feel more positive about: Bill or Hillary.
4. The value at the 25th percentile tells us that 25% of the voters gave responses that are lower than that value.
a. What is the feeling thermometer value at the 25th percentile for Hillary?
b. What is the feeling thermometer value at the 25th percentile for Bill?
c. Write a few sentences that compare the 25th percentile feeling thermometer values for Bill and Hillary. What does the comparison suggest about differences between voters in their feelings towards Bill and Hillary?
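The thermometer comparisons above can be sketched as follows. The example assumes ANES2016.dta is in the current working directory. After summarize with the detail option, Stata stores the results in r(), so individual percentiles can be displayed directly instead of being read off the full table.

```stata
* Sketch only: assumes ANES2016.dta is in the current working directory
use ANES2016.dta, clear

* Detailed summary for the Hillary Clinton thermometer
summarize V161086, detail
display "Hillary 25th percentile: " r(p25)
display "Hillary median: " r(p50)

* Detailed summary for the Bill Clinton thermometer
summarize V161093, detail
display "Bill 25th percentile: " r(p25)
display "Bill median: " r(p50)
```

The r() results are overwritten by each new summarize command, so display the values for one variable before summarizing the next.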
C. Summarize
1. Make a list of the Stata commands used to analyze:
a. Continuous/interval-ratio variables
b. Categorical variables: binary
c. Categorical variables: ordinal
d. Categorical variables: nominal