
Date: 2024-01-04 09:05

ENVS363/563.3 - A Computational Essay 2023/24

Overview and Instructions

Due Date: 8th January 2024

50% of the final mark

Overview

Here’s the premise. You will take the role of a real-world GIS analyst or spatial data scientist tasked

with exploring datasets on the San Francisco Bay Area (often just called the Bay Area) and finding useful

insights for a variety of city decision-makers. It does not matter if you have never been to the Bay

Area. In fact, this will help you focus on what you can learn about the city through the data, without

the influence of prior knowledge. Furthermore, the assessment will not be marked based on how

much you know about the San Francisco Bay Area but instead about how much you can show you

have learned through analysing data. You will need to contextualise your project by highlighting the

opportunities and limitations of ‘old’ and ‘new’ forms of spatial data and reference relevant

literature.

Format

A computational essay using Quarto. The assignment should be carried out fully in Quarto.

What is a Computational Essay?

A computational essay is an essay whose narrative is supported by code and computational results

that are included in the essay itself. This piece of assessment is equivalent to 4,000 words.

However, this is the overall weight. Since you will need to create not only narrative but also code

and figures, here are the requirements:

• Maximum of 1,000 words (ordinary text) (references do not contribute to the word

count). You should answer the specified questions within the narrative. The questions

should be included within a wider analysis.

• Up to five maps or figures (a figure may include more than one map and will only count

as one but needs to be integrated in the same overall output)

• Up to one table

There are three kinds of elements in a computational essay.

1. Ordinary text (in English)

2. Computer input (R code)

3. Computer output

These three elements all work together to express what’s being communicated.

Submission

• You must submit one electronic copy of your assessment via Canvas by the published

deadline. The file must be an HTML document. Please do not include your

name anywhere in the document.

• Please refer to the ENVS363/563 Assessment criteria. This document includes the parts

you should include in your Computational Essay.

Data

The assignment relies on datasets and has two parts. Each dataset is explained in more detail

below.

ENVS363-563 Computational Essay

• Data made available on Murray Cox’s website as part of his “Inside Airbnb” project which

you can download (http://insideairbnb.com/). The website periodically publishes

snapshots of Airbnb listings around the world. You should download the San Francisco

data, the San Mateo data and the Oakland data. These are all part of the Bay Area.

Please note that, for best results, you will need to drop some of the outliers.

• Socio-economic variables for the Bay Area. Source: American Community Survey (ACS)

2016-2020, US Census Bureau. Observations: 1039; Variables: 472; Years: 2016-2020.

o A subset of variables from the latest ACS has already been retrieved for you in

ACS_2016_2020_vars.csv. However, you have access to ALL variables in the

American Community Survey (ACS) 2016-2020 through the R package

tidycensus.

o You are strongly encouraged to use the Census API through the R package

tidycensus to extract your variables of interest instead of the CSV. For more

information about the ACS (2016-2020) you can have a look at:

https://www.census.gov/data/developers/data-sets/acs-5year.html and

https://api.census.gov/data/2020/acs/acs5/variables.html.
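
The note above about dropping outliers from the Airbnb data can be illustrated with an interquartile-range filter. The essay itself must be written in R; the following is only a minimal Python sketch of the logic. The sample prices, the helper name drop_price_outliers and the 1.5 x IQR threshold are assumptions for illustration, not a method prescribed by the brief.

```python
import statistics


def drop_price_outliers(prices, k=1.5):
    """Keep prices within [Q1 - k*IQR, Q3 + k*IQR] (Tukey's rule).

    The threshold k=1.5 is a common convention, not a requirement
    of the assignment.
    """
    q1, _, q3 = statistics.quantiles(prices, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [p for p in prices if low <= p <= high]


# Hypothetical nightly prices; 9999 stands in for an implausible listing.
prices = [80, 95, 110, 120, 135, 150, 175, 9999]
kept = drop_price_outliers(prices)
print(kept)
```

Whatever rule you choose, state it in the essay and justify it, since the choice changes the averages mapped later.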

If you want to visualise some aspects at different Subnational Administrative boundaries, you can

download USA boundaries from GADM. You can also find other geodata for the Bay Area in the

Berkeley Library.

IMPORTANT - Students of ENVS563 will need to source at least two additional datasets relating

to San Francisco or the Bay Area. You can use any dataset that will help you complete the tasks

below but, if you need some inspiration, have a look at the following:

• Geodata for the Bay Area in the Berkeley Library.

• San Francisco Open Data Portal: https://datasf.org/opendata/

• Data World: https://data.world/datasets/san-francisco

• NASA Data: https://earthdata.nasa.gov/earth-observation-data/near-real-time/hazards-and-disasters/air-quality

Part 1 – Common

1.1 Collecting and importing the data

1.1.1 Import and explore

1.2 Preparing the data

1.2.1 What CRS are you going to use? Justify your answer.

1.3 Discussion of the data

• Present and describe the data sets used for this project.

1.4 Mapping and Data visualisation

1.4.1 Airbnb in the Bay Area at Neighbourhood Level

• Summarise the data using Bay Area zipcodes/ZCTAs obtained from the Berkeley Library.

These are slightly different from the Airbnb neighbourhood file. Obtain a count of listings by

neighbourhood.

• Map 1.1: Number of listings per zipcode. Explore the spatial distribution of the data using

choropleths. Style the layers using a colour ramp.

• Map 1.2: Average price per zipcode. Explore the spatial distribution of the data using

choropleths. Style the layers using a colour ramp.

Justify your data classification methods and visualisation choices. You should include these maps

in your assessment submission. The maps should be well-presented and include a short

description.
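
The two summaries behind Maps 1.1 and 1.2 (a count of listings and an average price per zipcode) amount to a grouped aggregation. The essay requires R; this is only a minimal Python sketch of the logic. It assumes each listing has already been assigned a zipcode (in practice this would come from a spatial join of listing points to the ZCTA polygons), and the field names, sample values and the helper summarise_by_zipcode are hypothetical.

```python
from collections import defaultdict


def summarise_by_zipcode(rows):
    """Return {zipcode: {"n_listings": int, "mean_price": float}}."""
    prices = defaultdict(list)
    for row in rows:
        prices[row["zipcode"]].append(row["price"])
    return {
        zipcode: {"n_listings": len(p), "mean_price": sum(p) / len(p)}
        for zipcode, p in prices.items()
    }


# Hypothetical listings, already tagged with the zipcode they fall in.
listings = [
    {"zipcode": "94110", "price": 150.0},
    {"zipcode": "94110", "price": 250.0},
    {"zipcode": "94607", "price": 100.0},
]
summary = summarise_by_zipcode(listings)
print(summary)
```

Zipcodes with no listings will be absent from the result, so decide explicitly whether they should appear on the map as zero or as missing.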

Questions to answer within your analysis: How does the Inside Airbnb data compare to other ‘new’

forms of spatial data? Discuss the potential insights and biases, as well as opportunities and

limitations of the Airbnb data.

1.4.2 Socio-economic variables from the ACS data

Select two variables from American Community Survey data. These could be, but are not limited

to, population density, median income, median age, unemployment, percentage of black population,

percentage of Hispanic population or education level. See the Appendix in this document for help.

If you choose to calculate population percentages, make sure you standardise the table by the

population size of each tract.
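
Standardising by population size means dividing a group count by the tract total before mapping it. The essay requires R; this is only a minimal Python sketch of the arithmetic. The column names black and all_ppl_race are the ones listed in the Appendix, while the tract values and the helper percent_of_population are made up for illustration.

```python
def percent_of_population(count, total):
    """Return count as a percentage of total, or None for empty tracts."""
    if total == 0:
        return None  # avoid dividing by zero for unpopulated tracts
    return 100.0 * count / total


# Hypothetical tracts using the recoded column names from the Appendix.
tracts = [
    {"tract": "A", "black": 120, "all_ppl_race": 2400},
    {"tract": "B", "black": 300, "all_ppl_race": 1500},
    {"tract": "C", "black": 0, "all_ppl_race": 0},
]
for t in tracts:
    t["pct_black"] = percent_of_population(t["black"], t["all_ppl_race"])

print([(t["tract"], t["pct_black"]) for t in tracts])
```

Tract A has fewer black residents than tract B in absolute terms here, but the percentages (5% versus 20%) are what make the two tracts comparable on a choropleth.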

• Map 2: Explore the spatial distribution of your chosen variables using choropleths. Style the

variables using a colour ramp. Justify your data classification methods and visualisation

choices. You should include these maps in your assessment submission. The maps should

be well-presented and include a short description.

Questions to answer within your analysis: Comment on the details of your map and analyse the

results. What are the main types of neighbourhoods you identify? Which characteristics help you

delineate this typology? What can you say about the spatial distribution of your socio-economic

variable of interest? If you had to use this classification to evaluate where Airbnbs would cluster,

what would your hypothesis be? Why?

For some stylised (not necessarily accurate) facts about the Bay Area see here.

1.4.3 Combining Data sets

• Map 3: Plot the natural logarithm of price (ln of price) of Airbnbs in the San Francisco Bay

Area (point plot) together with one of your chosen socio-economic variables of interest

at zipcode level (polygon plot) using ggplot, tmap or mapsf. There are various ways of

doing this. The maps should be well-presented.
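
The natural-log transform for Map 3 is only defined for positive prices, so zero or missing prices need handling before plotting. The essay requires R; this is only a minimal Python sketch of the transform. The sample prices and the helper log_price are hypothetical, and returning None for unusable prices is one possible choice rather than a requirement of the brief.

```python
import math


def log_price(price):
    """Return ln(price), or None when the price is missing or not positive."""
    if price is None or price <= 0:
        return None
    return math.log(price)


# Hypothetical nightly prices, including one unusable zero.
prices = [50.0, 120.0, 0.0, 1000.0]
log_prices = [log_price(p) for p in prices]
print(log_prices)
```

The log compresses the long right tail of prices, so a colour scale on ln(price) separates the bulk of ordinary listings instead of being dominated by a few expensive ones.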

Questions to answer within your analysis: Comment on the details of your map and analyse the

results. Does this map tell you more about the relationship between Airbnb location/price and

your socio-economic variable of choice? Explain your answer.

1.4.4 Autocorrelation

• Map 4: Explore the degree of spatial autocorrelation. Describe the concepts behind your

approach and interpret your results.
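
One common measure of global spatial autocorrelation is Moran's I. The brief does not prescribe a particular statistic, so treat this as one option among several. The essay requires R; this is only a minimal Python sketch of the formula using binary contiguity weights. The four areas arranged in a row, their values and the helper morans_i are assumptions for illustration.

```python
def morans_i(values, neighbours):
    """Global Moran's I with binary weights.

    values: one number per area.
    neighbours: {area index: [indices of adjacent areas]} (symmetric).
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    # Sum of cross-products of deviations over all neighbouring pairs.
    cross = sum(dev[i] * dev[j] for i, js in neighbours.items() for j in js)
    # S0 is the sum of all weights (each neighbour link has weight 1).
    s0 = sum(len(js) for js in neighbours.values())
    return (n / s0) * cross / sum(d * d for d in dev)


# Hypothetical layout: four areas in a row, each touching the next one.
neighbours = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
clustered = morans_i([1, 2, 3, 4], neighbours)    # similar values adjacent
alternating = morans_i([1, 4, 1, 4], neighbours)  # dissimilar values adjacent
print(clustered, alternating)
```

Positive values indicate that similar values sit next to each other, negative values indicate a checkerboard-like pattern, and values near zero suggest no spatial structure. A significance test (for example by permutation) is needed before interpreting the number.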

Part 2 – Choose your own analysis

For this one, you need to pick one of the following three options. Only one, and make the most

of it.

Please Note: This part of the assignment can be done on the Bay Area as a whole or you can

zoom in on one of the counties. For example, you could just focus on San Francisco.

1. Create a geodemographic classification and interpret the results. In the process, answer

the following questions:

• What are the main types of neighbourhoods you identify?

• Which characteristics help you delineate this typology?

• If you had to use this classification to target areas in most need, how would you use it?

Why?

2. Create a regionalisation and interpret the results. In the process, answer at least the

following questions:

• How is the city partitioned by your data?

• What do you learn about the geography of the city from the regionalisation?

• What would be one useful application of this regionalisation in the context of urban policy?

3. Use the OpenStreetMap package osmdata to download Point of Interest (POI) data for

the Bay Area or San Francisco. Using this data, complete the following tasks:

• Visualise the dataset appropriately and discuss why you have taken your specific

approach

• Use DBSCAN to identify areas of the city with high density of POIs, which we will call

areas of interest (AOI). In completing this, answer the following questions:

o What parameters have you used to run DBSCAN? Why?

o What do the clusters help you learn about areas of interest in the city?

o Name one example of how these AOIs can be of use for the city. You can take

the perspective of an urban planner, a policy maker, an operational

practitioner (e.g. police, trash collection), an urban entrepreneur, or any

other role you envision.
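
The two DBSCAN parameters asked about above are the neighbourhood radius (eps) and the minimum number of points needed to form a dense core (min_pts). The essay requires R; this is only a minimal Python sketch of the algorithm so that the role of each parameter is visible. The coordinates, parameter values and the helper dbscan are hypothetical, and the distances assume planar, projected coordinates.

```python
import math


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)

    def neighbours(i):
        # All points within eps of point i, including i itself.
        return [j for j, q in enumerate(points)
                if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point of this cluster
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            nbrs = neighbours(j)
            if len(nbrs) >= min_pts:  # j is a core point: keep expanding
                queue.extend(nbrs)
    return labels


# Hypothetical POIs: two tight groups and one isolated point.
pois = [(0, 0), (0, 1), (1, 0), (1, 1),
        (10, 10), (10, 11), (11, 10), (11, 11),
        (50, 50)]
labels = dbscan(pois, eps=1.5, min_pts=3)
print(labels)
```

A larger eps merges nearby groups into one area of interest, while a larger min_pts turns sparse groups into noise, so both choices should be justified against the scale of the city and the density of the POI data.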

Resources to help you. See also the suggested bibliography in the slides throughout the course.

• https://www.r-bloggers.com/2017/11/programming-meh-lets-teach-how-to-write-computational-essays-instead/

• https://rmarkdown.rstudio.com/

• https://www.rstudio.com/wp-content/uploads/2015/02/rmarkdown-cheatsheet.pdf

• https://vizual-statistix.tumblr.com/post/114850050736/i-find-the-spread-of-airbnb-to-be-as-fascinating

• https://carto.com/blog/airbnb-impact/

• https://cran.r-project.org/web/packages/biscale/vignettes/biscale.html

Appendix

American Community Survey (ACS) 2016-2020, US Census Bureau. Observations: 1039; Variables:

472; Years: 2016-2020

Variable                  Description

B19013_001E               Median household income in the past 12 months (in 2020 inflation-adjusted dollars). Coded as hh_income.

B02001 (list of vars)     Population by race. I have already recoded black (number of black people) and all_ppl_race (total population by census tract).

B23006 (list of vars)     Population by education

C15002A (list of vars)    Population by Sex by Education

C27012 (list of vars)     Population by Health insurance

B08006 (list of vars)     Commuting variable

B09010 (list of vars)     Supplementary income variables

B09019 (list of vars)     Household type counts

B17001 (list of vars)     Poverty Status

B28011 (list of vars)     Internet Access

B99084 (list of vars)     Work From Home

For every entry marked (list of vars), see https://api.census.gov/data/2020/acs/acs5/variables.html for the individual variables.

