Your task is to answer the research questions given in Section 2 to Section 6 below, using Dataset 1 or Dataset 2 as indicated in each section. To answer each question, first present the relevant numerical summary (summary statistics) and graphical display, then perform a suitable statistical analysis to make inferences and draw a conclusion.
You need to submit a Word document that shows all computer outputs (numerical summaries and graphs) and your discussion.
1. Section 1: Introduction
Provide a brief and clear introduction to the report (e.g. the objective of the report, the datasets involved, etc.). Find one to three articles that are relevant to any of the research questions (given in Section 2 to Section 6) and write a proper literature review. Your literature review should include in-text citations, and you will need to add a reference list at the end of your report.
2. Section 2: Did around 25% of taxpayers in 2013-2014 self-prepare their tax return?
Using Dataset 1, first provide both a numerical summary and a graphical display that clearly show the proportion of each lodgement method.
Then, construct a 95% confidence interval for the population proportion of taxpayers who use the Self-preparer lodgement method.
Finally, answer the research question using the confidence interval.
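As a rough illustration of the confidence interval step, the sketch below computes a normal-approximation (Wald) interval for a proportion using only the Python standard library. The function name `proportion_ci` and the counts are hypothetical and are not taken from Dataset 1. The approximation assumes a sample large enough for the normal distribution to be reasonable.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(successes, n, confidence=0.95):
    """Normal-approximation (Wald) confidence interval for a proportion."""
    p_hat = successes / n
    # Two-sided critical value, roughly 1.96 for a 95% interval.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# Hypothetical counts for illustration only, NOT taken from Dataset 1.
low, high = proportion_ci(successes=260, n=1000)
print(f"95% CI: ({low:.3f}, {high:.3f})")
print("0.25 lies inside the interval:", low <= 0.25 <= high)
```

If 0.25 falls inside the interval, the data are consistent with around 25% of taxpayers being self-preparers. If it falls outside, they are not.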
3. Section 3: Is the average salary of taxpayers in 2013-2014 less than $45,000?
Using Dataset 1, first describe the distribution of salary amounts of Australian taxpayers in 2013-2014. You need to provide a numerical summary (sample size, mean, standard deviation and median) as well as a graphical display that shows any outliers.
Then perform a suitable hypothesis test to answer the research question above at 5% level of significance.
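One possible shape for this test is a one-sample, lower-tailed test of H0: mean = 45,000 against H1: mean < 45,000. The sketch below is a minimal illustration under two loud assumptions: the salaries are synthetic (generated from a seeded random generator, not Dataset 1), and the sample is large enough that the t distribution can be approximated by the standard normal. For a small sample, the p-value should come from a t distribution with n - 1 degrees of freedom in your statistical software instead. The function name `one_sample_lower_tail` is hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist, mean, stdev


def one_sample_lower_tail(sample, mu0):
    """Test statistic and approximate p-value for H0: mu = mu0 vs H1: mu < mu0."""
    n = len(sample)
    t_stat = (mean(sample) - mu0) / (stdev(sample) / sqrt(n))
    # Assumption: n is large, so the standard normal approximates the t distribution.
    p_value = NormalDist().cdf(t_stat)
    return t_stat, p_value


# Synthetic salaries for illustration only, NOT taken from Dataset 1.
rng = random.Random(1)
salaries = [rng.gauss(44000, 9000) for _ in range(400)]

t_stat, p_value = one_sample_lower_tail(salaries, mu0=45000)
print(f"n = {len(salaries)}, mean = {mean(salaries):.2f}, sd = {stdev(salaries):.2f}")
print(f"test statistic = {t_stat:.3f}, approximate p-value = {p_value:.4f}")
print("Reject H0 at the 5% level:", p_value < 0.05)
```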
4. Section 4: For total income between $75,000 and $80,000, is there a difference in the total deduction between different lodgement methods?
Using Dataset 1, first filter the total income to include only incomes from $75,000 to $80,000 (inclusive). Then provide the numerical summary for the total deduction, grouped by lodgement method. You also need to provide a graphical display that shows any outliers.
Then, perform a suitable hypothesis test to answer the research question above. Use a 5% significance level.
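One candidate for comparing mean total deduction across lodgement methods is a one-way ANOVA. With exactly two groups, this is equivalent to a pooled two-sample t-test. The sketch below only computes the F statistic and its degrees of freedom. The p-value would then come from the F distribution in your statistical software. The function name `one_way_anova_f`, the group labels and all values are hypothetical, and the usual ANOVA assumptions (independent observations, roughly normal groups, similar variances) are taken for granted here.

```python
from statistics import mean


def one_way_anova_f(groups):
    """F statistic and degrees of freedom for a one-way ANOVA."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical deductions for illustration only, NOT taken from Dataset 1.
deductions_by_method = {
    "Agent": [2400, 3100, 2800, 3500],
    "Self-preparer": [1200, 1500, 900, 1800],
}

f_stat, df_between, df_within = one_way_anova_f(list(deductions_by_method.values()))
print(f"F({df_between}, {df_within}) = {f_stat:.3f}")
```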
5. Section 5: Is there any relationship between total income amount and total deduction amount for self-preparer?
Using Dataset 1, first filter the data to include only the Self-preparer lodgement method, then describe the relationship between total income amount and total deduction amount using a suitable graph.
Then, perform a regression analysis and provide the numerical summary.
Finally, interpret the correlation coefficient, the coefficient of determination and the relevant p-values and use them to answer the research question.
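The quantities named above can be sketched for a simple linear regression of total deduction on total income as follows. This is a minimal illustration with hypothetical values, not Dataset 1, and `simple_regression` is a made-up helper name. It reports the slope, intercept, correlation coefficient r, coefficient of determination r squared, and the t statistic for the slope. The matching p-value would come from a t distribution with n - 2 degrees of freedom in your statistical software.

```python
from math import sqrt
from statistics import mean


def simple_regression(x, y):
    """Least-squares fit of y on x with r, r squared and the slope's t statistic."""
    n = len(x)
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    r = sxy / sqrt(sxx * syy)
    # t statistic for H0: slope = 0 (undefined when r is exactly +1 or -1).
    t_slope = r * sqrt((n - 2) / (1 - r ** 2))
    return {"slope": slope, "intercept": intercept, "r": r,
            "r_squared": r ** 2, "t_slope": t_slope}


# Hypothetical values for illustration only, NOT taken from Dataset 1.
total_income = [40000, 55000, 62000, 78000, 90000]
total_deduction = [1200, 2100, 1900, 3200, 3900]

result = simple_regression(total_income, total_deduction)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

Here r describes the direction and strength of the linear relationship, while r squared is the share of the variation in total deduction explained by total income.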
6. Section 6: Is there a relationship between the gender of international students and whether they would use a tax agent?
Using Dataset 2, describe the relationship between the gender of international students and whether they would use a tax agent. You need to provide both a numerical summary and a graphical display.
Then, perform a suitable hypothesis test to answer the research question above. Use a 5% significance level.
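For two categorical variables, one commonly used option is a chi-square test of independence. The sketch below assumes both variables have exactly two categories (a 2x2 table), uses no continuity correction, and relies on the fact that a chi-square p-value with 1 degree of freedom can be written with the complementary error function. The function name `chi_square_2x2` and the counts are hypothetical and are not taken from Dataset 2.

```python
from math import erfc, sqrt


def chi_square_2x2(table):
    """Pearson chi-square statistic and p-value for a 2x2 contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    # With 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts for illustration only, NOT taken from Dataset 2.
# Rows: gender; columns: would use a tax agent (yes, no).
observed = [[30, 20],
            [25, 25]]

chi2, p_value = chi_square_2x2(observed)
print(f"chi-square = {chi2:.4f}, p-value = {p_value:.4f}")
print("Reject independence at the 5% level:", p_value < 0.05)
```

The chi-square approximation is usually considered adequate when every expected count is at least 5, so it is worth checking the expected counts before relying on the p-value.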
7. Section 7: Conclusion
Write a summary of all the findings in the previous sections, and then write concluding statements that would help a stakeholder (e.g. tax agents, taxpayers) to take management action.
Finally, suggest further research by discussing an interesting topic or a research question that can be further explored related to the datasets and/or the findings.