Date: 2021-04-13 10:17

ETC5512: Assignment 1

Due date: 11.55pm, April 16, 2021

Learning objectives

This assignment is designed to assess whether you

have developed an understanding of the limitations of various types of data collection.

can utilise open data sources, by accessing two different formats, with the purpose of extracting data to solve a problem.

can write a reproducible report to communicate your solution to a problem, in an informative and readable manner.

You are still early in your studies in the MBAt, and we don’t expect you to be a master of R, or a genius data analyst, yet. This is a first step in that journey. We would like you to focus on thinking about the problem being tackled, communicating your fresh ideas for tackling it, and carrying out a very simple analysis, primarily summary statistics and plots of the data. Revisiting the material from lectures and tutorials so far on data collection methods and extracting open data will help you get started.

Turn-in

Please use assignment1_template.zip on Moodle as the template. Produce a reproducible report of at most 1000 words (about four pages) [1] and submit it as a “zip” file containing a single html file and a single Rmd file. The Rmd file must be self-contained and compile without error when placed in the right location in the project structure given in the template provided. The html file you submit should be the result of compiling your Rmd file.

Your Rmd should be named FamilyName-GivenName.Rmd, where FamilyName and GivenName are replaced with your family name and given name, respectively. If you have a middle name or preferred name that you would like to include, please add it between FamilyName and GivenName, separated by a hyphen.

You should include your data, in the data directory, BUT this should be a subset of the full BTS data containing only the records for the two airports. The reason is that the full downloaded data is too big for easy upload to and download from Moodle. The ALA data set can also be reduced in size to contain just the information necessary for studying the assigned problem.
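The subsetting step can be sketched as follows. This is a minimal illustration only, written in Python to show the logic; your submitted work should do the equivalent in R inside the Rmd file. The column names ORIGIN and DEST, the helper subset_rows, and the three sample rows are assumptions made up for this sketch, so check them against the headers in your own download.

```python
import csv
import io

# Airports named in the task description.
AIRPORTS = {"DFW", "LAX"}


def subset_rows(rows, airports=AIRPORTS, origin_col="ORIGIN", dest_col="DEST"):
    """Keep rows whose origin or destination is one of the given airports.

    The column names are assumptions; adjust them to match your file.
    """
    return [
        row for row in rows
        if row[origin_col] in airports or row[dest_col] in airports
    ]


# Tiny made-up sample standing in for the full download (not real BTS data).
SAMPLE_CSV = """ORIGIN,DEST,DEP_DELAY
DFW,ORD,5
JFK,LAX,-3
SEA,ORD,12
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
kept = subset_rows(rows)
```

Only the rows that touch DFW or LAX remain in kept; writing just those rows to the data directory keeps the zip file small.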

This assignment is worth 25 marks in total. The assignment is marked on the quality of the report and the quality of the analysis. This is an individual assignment and the report that you submit for assessment must be your own work.

Task

You are commissioned as an independent business analyst consultant to the chief data officer of Qantas [2] to write a report comparing the efficiency of using DFW or LAX as the primary airport into and out of the USA, and to assess two locations in Victoria, Tullamarine and Bendigo, for a new plane storage facility, identifying which has the least impact on local endangered species.

For the first task, you will use the Bureau of Transportation Statistics aviation on-time performance database (https://www.transtats.bts.gov). You should be able to use the data downloaded during tutorials. (There is no need to make your tasks more complicated by using data like passenger numbers, fuel consumption or weather.)

For the second task, you need to download data from the Atlas of Living Australia, containing occurrence records within a 50 km radius of each of the airports, for records dating back to Jan 1, 2000. You are asked to focus on the species list provided, all of which are critically endangered in Victoria. (Keep it simple: this is purely an impact statement on wildlife, not on the terrain or physical conditions of the sites, so you only need to extract occurrence data.)
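If your download covers a wider area or period than needed, the radius and date conditions can be checked after the fact. The sketch below, in Python for illustration only, uses the haversine great-circle distance. The site coordinates are approximate values assumed for this example, the record keys (species, lat, lon, date) are hypothetical, and the three sample records are made up and are not real ALA data.

```python
from datetime import date
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

# Approximate coordinates (assumed for this sketch; verify before use).
SITES = {
    "Tullamarine": (-37.67, 144.84),
    "Bendigo": (-36.74, 144.33),
}


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def records_near(records, site, radius_km=50.0, since=date(2000, 1, 1)):
    """Keep records dated on or after `since` and within `radius_km` of the site."""
    lat0, lon0 = SITES[site]
    return [
        r for r in records
        if r["date"] >= since
        and haversine_km(lat0, lon0, r["lat"], r["lon"]) <= radius_km
    ]


# Made-up records for illustration only.
SAMPLE_RECORDS = [
    {"species": "Perameles gunnii", "lat": -37.70, "lon": 144.90, "date": date(2015, 3, 1)},
    {"species": "Perameles gunnii", "lat": -37.70, "lon": 144.90, "date": date(1995, 3, 1)},
    {"species": "Ornithorhynchus anatinus", "lat": -36.75, "lon": 144.30, "date": date(2010, 6, 1)},
]

near_tullamarine = records_near(SAMPLE_RECORDS, "Tullamarine")
near_bendigo = records_near(SAMPLE_RECORDS, "Bendigo")
```

The 1995 record is dropped by the date condition, and each remaining record is kept only for the site it lies close to.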

Advice on downloading from ALA:

I found it easier to download directly from the web site than to use the R package ALA4R.

Go to “Search & Analyse”, then “Search and download records”, and select “Batch taxon records”.

Cut and paste the list of taxa below

Species being considered

Anthochaera (Xanthomyza) phrygia

Thinornis cucullatus

Perameles gunnii

Petauroides volans

Petrogale penicillata

Neophema (Neonanodes) chrysogaster

Ornithorhynchus anatinus

Use “Education” as the reason

You will get an email, once your subset is ready. This is usually within 5 minutes. Click the link in the email to get the download.

Look at the files that have arrived with your data. There is a DOI [3]. There is also a citation file containing details of the sources of the data and how to cite them appropriately. There might be a lot of sources for your subset, so for this assignment exercise you can skip the citations.

Your report is written for the Qantas chief data officer. Although your report should not contain code, the analysis and the assumptions you need to make should be explained well enough to impress a reader with a technically proficient background.

Note that there is not one correct answer. It is more important to have a clear explanation justifying your answer.

Analysis

Task 1: Airport efficiency (total of 8 marks)

Your analysis should contain the following elements:

(3pts) Computed summary statistics for both airports to compare and contrast their operations.

(5pts) Several (at least two) plots, that compare and contrast the two airports.
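As one possible starting point for the summary statistics, the sketch below computes a count, mean, median and share of late flights per airport. It is written in Python for illustration only, and the delay values are made up. Treating a flight as late at 15 minutes or more is a choice made for this sketch; whatever definition you use, state it in your report.

```python
from statistics import mean, median


def summarise(delays, late_threshold=15):
    """Summarise a list of delays (in minutes) for one airport.

    `late_threshold` is an assumption of this sketch, not a requirement.
    """
    return {
        "n": len(delays),
        "mean": mean(delays),
        "median": median(delays),
        "share_late": sum(d >= late_threshold for d in delays) / len(delays),
    }


# Made-up delays in minutes, for illustration only (not real BTS data).
sample_delays = {
    "DFW": [0, 10, 20, 30],
    "LAX": [-5, 5, 15, 45],
}

summary = {airport: summarise(d) for airport, d in sample_delays.items()}
```

In this made-up sample the two airports have the same mean delay but different medians, which is why reporting more than one statistic, and plotting the distributions, gives a fairer comparison.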

Task 2: Impact on wildlife (total of 8 marks)

Your analysis should contain the following elements:

Summary statistics comparing both locations.

Two maps, showing occurrences at both locations

At least one other plot comparing and contrasting the two locations

Hint: You’ll need to think about the type of data collection that is used for the Atlas. For example, if there are no records at a particular site, does it mean that there are no species living at that location?

Report (total of 9 marks)

The report should satisfy the following criteria:

Two main sections, each approximately two pages long, detailing your findings.

(3pts) A summary paragraph containing what you have learned about the problem: (1) the relative efficiency of the two airports, (2) impact on wildlife.

(1pt) A summary (possibly a table) of the terms used in the analysis, and how different quantities were calculated. Clearly state how you are defining efficiency (is it delays, is it number of connecting flights into and out of the airport) and impact (number of one species, variety of species).

(2pt) A section describing the data, including (i) an overview of the database, (ii) why your analysis is an acceptable use of the open database, (iii) whether the samples you have used for the analysis allow you to make inferences more broadly (or not).

(1pt) Detailed and concise explanation of the methods used in the analysis, without showing the code in the report. (Note: code should be in the Rmd file, in sufficient quality to reproduce your work.) Discuss any limitations of your analysis and/or interpretations, possibly based on the samples you are working with.

(1pt) Appropriate referencing to all literature, software, and data sources in an academic referencing style. (This won’t count in the word limit.)

(1pt) Appropriate spelling and grammar checks so that the report is high quality.

You can add an appendix or supplementary material, containing tables and plots that you find interesting but not important enough to include in the main report. (Note that this may NOT be read during marking.)

Additional resources

If you have questions about writing a report, please consult the Q-manual.


This is a glossary of the endangered species included in this subset from the Atlas of Living Australia.

Regent Honeyeater - Anthochaera (Xanthomyza) phrygia

Hooded Plover - Thinornis cucullatus

Eastern Barred Bandicoot - Perameles gunnii

Platypus - Ornithorhynchus anatinus

Greater Glider - Petauroides volans

Brush-Tailed Rock-Wallaby - Petrogale penicillata

Orange-Bellied Parrot - Neophema (Neonanodes) chrysogaster

[1] Note that you can use the word counter in RStudio to check your document against this limit.

[2] This is just a hypothetical scenario for your assignment. You are not really commissioned by Qantas.

[3] Check the lecture notes on DOI to know what this means. This is a permanent link to the subset that you created, which can be used to share your data and analysis with others.


