
Date: 2024-05-22 10:06

Visual Analytics Coursework Specification

Spring 2024

1. Overview

This coursework aims to give you experience of the whole lifecycle of carrying out a full

visual analytics project.

Your goals are:

• To follow a sound visual analytics process

• To develop a visualisation that displays important features of a dataset

• To write a clear report on your findings.

The outputs from this work should be

1. a Tableau dashboard and associated worksheets (as a packaged workbook: see

https://help.tableau.com/current/pro/desktop/enus/save_savework_packagedworkbooks.htm);

2. a written report with sections as defined below.

The submission deadline is 13:00 on Wednesday 22nd May through Blackboard: create a

single zip file containing all the files in your submission. This coursework is worth 80% of the

marks for the unit.

2. Task Details

The task you are asked to carry out for the coursework is to design, construct, and evaluate

an exploratory analysis of a complex dataset using both information visualisation and data

projection. This dataset should be based on census data for England and Wales. You should

design the visualisation to address a socio-economic issue that is important to you.

You must submit at least two data projections using different algorithms. I expect that

you will do this work in Python (following the methods you have practised in the labs) and for

each projection, create a matrix with two columns representing the two variables the data is

projected onto. If you save this matrix in a file (e.g. CSV format) it can then be imported

easily into Tableau and used in your visualisations. I want to review the Python code used to

generate the projections, so please include it in your submission. The purpose of data

projection is to show the data structure: clusters, outliers, and relationships between different

labels.
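As an illustration of the workflow described above, here is a minimal sketch of two projections computed with different algorithms and saved as two-column CSV files for Tableau. It is only one possible approach: the choice of PCA and RBF kernel PCA, the function names, the `area_code` column, and the `gamma` value are all assumptions made for this example, not requirements of the coursework. scikit-learn provides equivalent methods if you prefer to follow the labs more closely.

```python
import csv

import numpy as np


def standardise(X):
    """Centre each column and scale it to unit variance."""
    X = np.asarray(X, dtype=float)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # leave constant columns unscaled
    return (X - X.mean(axis=0)) / std


def pca_2d(X):
    """Linear projection onto the first two principal components."""
    Z = standardise(X)
    # Rows of Vt are the principal directions, ordered by variance explained.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    return Z @ Vt[:2].T


def kernel_pca_2d(X, gamma=0.1):
    """Non-linear projection: kernel PCA with an RBF kernel.

    gamma is an illustrative value and would need tuning for real data.
    """
    Z = standardise(X)
    sq_norms = (Z ** 2).sum(axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * Z @ Z.T
    K = np.exp(-gamma * np.maximum(sq_dists, 0.0))
    n = K.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    Kc = J @ K @ J  # centre the kernel matrix in feature space
    eigvals, eigvecs = np.linalg.eigh(Kc)  # eigenvalues in ascending order
    top = np.argsort(eigvals)[::-1][:2]
    # Coordinates of the input points on the two leading components.
    return eigvecs[:, top] * np.sqrt(np.maximum(eigvals[top], 0.0))


def save_projection(path, area_codes, coords):
    """Write one row per area: its code plus the two projected coordinates."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["area_code", "x", "y"])
        for code, (x, y) in zip(area_codes, coords):
            writer.writerow([code, float(x), float(y)])
```

Keeping an area identifier alongside the two coordinates means the projection file can be joined back to the original census tables in Tableau, and lets tooltips name the area behind each point.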

You may use data taken from the 2011 census in England and Wales which is indexed by

the Excel file 2011CensusIndexofTablesandTopics_v11_4_2.xlsx. The tab labelled ‘All

Tables’ provides a list of tables and links to the underlying data. (I have found that the Excel

file links are valid, the NESS links don’t work as the server can’t be found, and the links to

NOMIS take you to a website where additional data can be downloaded.) You may find

Tableau’s Data Interpreter useful, and you may also need to edit some files to create usable

datasets.

There are more than 1600 tables in total: clearly, this is far too many to create an interesting

report. You should focus on a limited number of tables (probably around three or four) that

allow you to explore a particular aspect of socio-economic life in England and Wales: for

example, health and links to nationality or occupation.

A new census was carried out in 2021 (during the pandemic). Some of the results have been

released by the Office for National Statistics, but so far these have only been in certain

topics. A link to the topics that have been released can be found here:

https://census.gov.uk/census-2021-results/phase-one-topic-summaries

You should find that you can click through on a topic to a map display

(https://www.ons.gov.uk/census/maps) and from here select a topic such as ‘Housing’.

Selecting a variable changes the map and also provides a link to download the data for that

variable. Perhaps simpler is to visit the bulk downloads page:

https://www.nomisweb.co.uk/sources/census_2021_bulk

You need to use both datasets, the 2011 data and the 2021 data, in at least one of your

visualisations.

Something to note: some geographic definitions don’t necessarily match between the two

census dates. This site will help you manage this:

https://www.ons.gov.uk/releases/censusmapsupdatechangeovertime
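Because area definitions can differ between the two censuses, it may help to check which areas actually match before combining the tables. The sketch below is a hypothetical helper, not a prescribed method: it assumes each table has been loaded as a list of dictionaries (for example with `csv.DictReader`) and that both share an `area_code` column, which is an assumed name.

```python
def join_census_years(rows_2011, rows_2021, key="area_code"):
    """Join two census tables on an area code.

    Returns the joined rows plus the codes that appear in only one census,
    so that boundary changes can be reviewed rather than silently dropped.
    """
    by_code_2011 = {row[key]: row for row in rows_2011}
    by_code_2021 = {row[key]: row for row in rows_2021}

    joined = []
    for code in sorted(by_code_2011.keys() & by_code_2021.keys()):
        merged = {key: code}
        # Suffix every measure with its census year to keep the columns distinct.
        for year, row in (("2011", by_code_2011[code]), ("2021", by_code_2021[code])):
            for name, value in row.items():
                if name != key:
                    merged[f"{name}_{year}"] = value
        joined.append(merged)

    only_2011 = sorted(by_code_2011.keys() - by_code_2021.keys())
    only_2021 = sorted(by_code_2021.keys() - by_code_2011.keys())
    return joined, only_2011, only_2021
```

The two lists of unmatched codes give a count of areas lost in the join, which is the kind of data anomaly worth reporting.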

Your report should contain the following sections:

• Abstract. A brief description of the key points in the report.

• Introduction. The background of the problem.

• Data Preparation and Abstraction. Describe the data manipulation necessary to create

a dataset for analysis and the principal data types and semantics that you have

analysed.

• Task Definition. A description, using Munzner’s task taxonomy, of the tasks for which you

have created the visualisations.

• Visualisation Justification. Define the visualization techniques you use and justify your

choices. You should refer to the principles of info vis, relevant aspects of human

perception and cognition, and the scientific literature where appropriate. You should also

explain why you have chosen the data projection methods that you have used. This

justification and explanation is a very important assessment criterion, so do not skimp on

this and make sure that it is grounded in the theoretical concepts we have covered

during the course.

• Evaluation. Using appropriate levels and types of validation (as in Chapter 4 of

Munzner), assess the quality of your visualization by making appropriate measurements

and observations of the other students in your discussion group in an analytic task using

your visualisation. (The list of discussion groups is also available on Blackboard.)

• Conclusion. I expect you to address two aspects:

  • What you have learned about the socio-economic problem that was the basis of the

    visualisation.

  • What you have learned about information visualisation from doing the coursework.

I am expecting the report to be about six to ten pages in length. This is an expectation, not a

strict limit, so there will be no penalty for exceeding it. But if you find yourself writing much

more than this, you are almost certainly providing too much detail. In particular, note that I

will see the visualisation you generate, so there should be little or no need for screenshots.

I use the term 'dashboard' in the Tableau sense of a set of visualisations on a single screen.

It is permissible to submit more than one Tableau dashboard or workbook if that supports

the task better. Do not feel you have to squeeze everything onto a single dashboard. You

may remember the system for visualising American census data that had every possible

graph interacting in lots of ways. It was just too crowded and complex to be useful.

Geocoding issues

It can be hard to plot the census data in Tableau because it does not contain outcode

information. This blog contains some geocoding packages that support geographic information

at many different levels of granularity, along with a video on how to use them. It should be

helpful for you.

You may have some problems with using geocoding packages, in which case this link to

Tableau help should be useful:

https://kb.tableau.com/articles/issue/error-the-custom-geocoding-folder-has-errors-whencreating-map

I have also provided a short guidance note written by Joshua Ramini on the Blackboard site.

3. Assessment

The assessment criteria are:

• Problem understanding: how well you have explained the goals of the tasks, taking

account of end-user requirements. (10 marks)

• Data preparation and task analysis: care taken over extracting and manipulating the

data; insights gained through the task analysis. (15 marks)

• Data visualisation: appropriateness of visualization and modelling approaches;

systematic use of statistical and visualisation methods; justification of visualization

approach used. (50 marks)

• Conclusions: what the user should learn from your analysis and what you have learned

about large-scale data visualisation. (15 marks)

• Presentation: fluency and coherence of the written text; quality of images and graphics

used. (10 marks)

Below are some general points that will help you when working on this coursework:

• Ensure that questions you set out to ask are answered by the visualisation and in the

report.

• Having the option of switching between absolute values and proportions is often a useful

feature. This is particularly helpful when comparing areas with different populations.

• When using dimensionality reduction it is important to communicate to the user which

variables were used in the original data space as otherwise, it is hard to interpret the

plots.

• Tooltips should identify the corresponding point (e.g. a location), particularly for projected

data.

• The introduction should contain some discussion of the type of user the visualization is

intended for.

• The report should note data anomalies (e.g. missing values) and, in particular,

quantify them (for example, the number of missing values).

• The abstract should describe the main findings of the work.

• Data cleaning matters.

• The use of section and page numbers helps the reader to navigate the report.

• References to secondary literature are valuable tools to provide context.
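
Two of the points above, switching between absolute values and proportions and quantifying missing values, could be prepared in Python before the data reaches Tableau. This is a rough sketch under stated assumptions: rows are dictionaries with numeric values already converted from text, missing cells appear as an empty string or `None`, and the function names are made up for illustration.

```python
MISSING_MARKERS = ("", None)  # assumption: how missing cells appear after loading


def proportion(count, population):
    """Return count as a share of population, or None if it cannot be computed."""
    if count in MISSING_MARKERS or population in MISSING_MARKERS or population == 0:
        return None
    return count / population


def count_missing(rows, columns):
    """Count, for each column, how many rows have a missing value."""
    return {
        col: sum(1 for row in rows if row.get(col) in MISSING_MARKERS)
        for col in columns
    }
```

Exporting both the raw count and the proportion as separate columns lets a Tableau parameter switch between them, and the per-column missing counts can go straight into the report.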

