###### Date: 2021-04-13 10:17

ETC5512: Assignment 1

Due date: 11.55pm, April 16, 2021

Learning objectives

This assignment is designed to assess whether you

have developed an understanding of the limitations of various types of data collection.

can utilise open data sources, by accessing two different formats, with the purpose of extracting data to solve a problem.

can write a reproducible report to communicate your solution to a problem, in an informative and readable manner.

You are early in your studies in the MBAt, and we don’t expect you to be a master of R, or a genius data analyst, yet. This is a first step in that journey. We would like you to think about the problem being tackled, communicate your fresh ideas for tackling it, and focus on a very simple analysis, primarily summary statistics and plots of the data. Revisiting the material from lectures and tutorials so far on data collection methods and extracting open data will help you get started.

Turn-in

This assignment is worth 25 marks in total. The assignment is marked on the quality of the report and the quality of the analysis. This is an individual assignment and the report that you submit for assessment must be your own work.

You are commissioned as an independent business analyst consultant for the chief data officer of Qantas [2] to write a report comparing the efficiency of using DFW or LAX as the primary airport into and out of the USA, and assessing which of two locations in Victoria, Tullamarine or Bendigo, would have the least impact on local endangered species as the site of a new plane storage facility.

For the first task, you will use the Bureau of Transportation Statistics aviation on-time performance database (https://www.transtats.bts.gov). You should be able to use the data downloaded during tutorials. (There is no need to make your task more complicated by using data like passenger numbers, fuel consumption or weather.)

For the second task, you need to download data from the Atlas of Living Australia, containing occurrence records within a 50 km radius of each of the airports, for records dating back to Jan 1, 2000. You are asked to focus on the species list provided; these species are critically endangered in Victoria. (Keep it simple: this is purely an impact statement on wildlife, not on the terrain or physical conditions of the sites, so you only need to extract occurrence data.)

I found it easier to download directly from the web site than to use the R package ALA4R.

Go to “Search & Analyse”, then “Search and download records”, and select “Batch taxon records”.

Copy and paste the list of taxa below.

Species being considered

Anthochaera (Xanthomyza) phrygia

Thinornis cucullatus

Perameles gunnii

Petauroides volans

Petrogale penicillata

Neophema (Neonanodes) chrysogaster

Ornithorhynchus anatinus

Use “Education” as the reason

Look at the files that have arrived with your data. There is a DOI [3]. There is also a citation file containing details of the sources of the data and how to cite them appropriately. There might be many sources for your subset, so for this assignment you can skip the citations.
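Once the records are loaded, the 50 km radius can be checked with a great-circle distance. The following is only a sketch in base R: the site centre is a hypothetical point (not either airport), the toy records are invented, and the column names `decimalLatitude` and `decimalLongitude` are an assumption about your download, so check them against your own file.

```r
# Great-circle (haversine) distance in km between points given in decimal degrees.
haversine_km <- function(lat1, lon1, lat2, lon2, radius_km = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * radius_km * asin(pmin(1, sqrt(a)))
}

# Hypothetical site centre; replace with the coordinates of the airport.
site <- c(lat = -37.0, lon = 144.5)

# Invented records; column names are assumed, not confirmed.
records <- data.frame(
  decimalLatitude  = c(-37.1, -37.0, -38.0),
  decimalLongitude = c(144.5, 144.9, 144.5)
)

# Distance of every record from the site centre, then keep those within 50 km.
records$dist_km <- haversine_km(site[["lat"]], site[["lon"]],
                                records$decimalLatitude,
                                records$decimalLongitude)
within_radius <- records[records$dist_km <= 50, ]
```

If the web download was already restricted to a 50 km radius, this works as a sanity check rather than a filter.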

Your report is written for the Qantas chief data officer. Although the report should not contain code, your analysis and the assumptions you make should be explained well enough to impress a technically proficient reader.

Note that there is no single correct answer. It is more important to give a clear explanation justifying your answer.

Analysis

Task 1: Airport efficiency (total of 8 marks)

Your analysis should contain the following elements:

(3pts) Computed summary statistics for both airports to compare and contrast their operations.

(5pts) Several (at least two) plots, that compare and contrast the two airports.
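As a starting point for the summary statistics, one option is to compute a few delay and cancellation measures per airport. The sketch below uses base R on invented toy rows; the column names `ORIGIN`, `DEP_DELAY` and `CANCELLED` are assumptions about your BTS download, and the 15-minute threshold is just one possible definition of "delayed".

```r
# Toy rows standing in for the on-time data; all values are invented.
# Column names are assumptions; check them against your own download.
flights <- data.frame(
  ORIGIN    = c("DFW", "DFW", "DFW", "LAX", "LAX", "LAX"),
  DEP_DELAY = c(5, 30, NA, -2, 12, 20),  # minutes; NA for the cancelled flight
  CANCELLED = c(0, 0, 1, 0, 0, 0)
)

# One row of summary statistics for the flights of a single airport.
summarise_airport <- function(d) {
  data.frame(
    airport          = d$ORIGIN[1],
    n_flights        = nrow(d),
    mean_dep_delay   = mean(d$DEP_DELAY, na.rm = TRUE),
    median_dep_delay = median(d$DEP_DELAY, na.rm = TRUE),
    prop_delayed_15  = mean(d$DEP_DELAY > 15, na.rm = TRUE),
    prop_cancelled   = mean(d$CANCELLED == 1)
  )
}

# Split by airport, summarise each piece, and stack the results.
airport_summary <- do.call(
  rbind,
  lapply(split(flights, flights$ORIGIN), summarise_airport)
)
print(airport_summary)
```

Whichever measures you choose, state in the report how they relate to your definition of efficiency.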

Task 2: Impact on wildlife (total of 8 marks)

Your analysis should contain the following elements:

Summary statistics comparing both locations.

Two maps, showing occurrences at both locations

At least one other plot comparing and contrasting the two locations

Hint: You’ll need to think about the type of data collection that is used for the Atlas. For example, if there are no records at a particular site, does it mean that there are no species living at that location?
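For the Task 2 summary statistics, a simple tally of records per species at each site is one possible starting point. The sketch below uses base R with invented records and placeholder species names; the column names `site` and `species` are hypothetical labels, not Atlas field names.

```r
# Placeholder species list; in practice this would be the seven taxa above.
species_list <- c("Species A", "Species B", "Species C")

# Invented occurrence records, one row per sighting.
occurrences <- data.frame(
  site    = c("Tullamarine", "Tullamarine", "Tullamarine", "Bendigo"),
  species = factor(c("Species A", "Species A", "Species B", "Species A"),
                   levels = species_list)
)

# Records per species at each site. Using a factor keeps species with
# zero records in the table, so gaps in the data stay visible.
counts <- table(occurrences$species, occurrences$site)

# Number of distinct species with at least one record at each site.
richness <- colSums(counts > 0)

print(counts)
print(richness)
```

A zero in this table means that no records were submitted, which is not the same as the species being absent.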

Report (total of 9 marks)

The report should satisfy the following criteria:

Two main sections, each approximately two pages long, detailing your findings.

(3pts) A summary paragraph containing what you have learned about the problem: (1) the relative efficiency of the two airports, (2) impact on wildlife.

(1pt) A summary (possibly a table) of the terms used in the analysis and how different quantities were calculated. Clearly state how you are defining efficiency (is it delays, or the number of connecting flights into and out of the airport?) and impact (the number of one species, or the variety of species?).

(2pts) A section describing the data, including (i) an overview of the database, (ii) why your analysis is an acceptable use of the open database, and (iii) whether the samples you have used for the analysis allow you to make inferences more broadly (or not).

(1pt) Detailed and concise explanation of the methods used in the analysis, without showing the code in the report. (Note: code should be in the Rmd file, in sufficient quality to reproduce your work.) Discuss any limitations of your analysis and/or interpretations, possibly based on the samples you are working with.

(1pt) Appropriate referencing to all literature, software, and data sources in an academic referencing style. (This won’t count in the word limit.)

(1pt) Appropriate spelling and grammar checks so that the report is of high quality.

You can add an appendix or supplementary material, containing tables and plots that you find interesting but not important enough to include in the main report. (Note that this may NOT be read during marking.)

If you have questions about writing a report, please consult the Q-manual.

Appendix

This is a glossary of the endangered species included in this subset from the Atlas of Living Australia.

Regent Honeyeater - Anthochaera (Xanthomyza) phrygia

Hooded Plover - Thinornis cucullatus

Eastern Barred Bandicoot - Perameles gunnii

Platypus - Ornithorhynchus anatinus

Greater Glider - Petauroides volans

Brush-Tailed Rock-Wallaby - Petrogale penicillata

Orange-Bellied Parrot - Neophema (Neonanodes) chrysogaster

[1] Note that you can use the word counter in RStudio to check this limit with your document.

[2] This is just a hypothetical scenario for your assignment. You are not really commissioned by Qantas.

[3] Check the lecture notes on DOI to know what this means. This is a permanent link to the subset that you created that can be used to share your data and analysis with others.

