INFT216 - Data Science 191
Assignment 4: Finding customer clusters at BondTelco
Assessment Value: 10%
Due Date: End of Week 12, Sunday 7th April 5pm (submit via iLearn)
Previously, management have been quite specific about what they were looking for and what they
wanted to predict. Now they are asking you what can be ‘discovered’ from the data. In particular,
they are interested in whether any natural groupings exist within their customer base.
You have access to the same data as Assignment 1.
COLLEGE : Is the customer college educated?
INCOME : Annual income
OVERAGE : Average overcharges per month
LEFTOVER : Average % leftover minutes per month
HOUSE : Value of dwelling (from census tract)
HANDSET_PRICE : Cost of phone
OVER_15MINS_CALLS_PER_MONTH : Average number of long (>15 mins) calls per month
AVERAGE_CALL_DURATION : Average call duration
REPORTED_SATISFACTION : Reported level of satisfaction
REPORTED_USAGE_LEVEL : Self-reported usage level
CONSIDERING_CHANGE_OF_PLAN : Was customer considering changing his/her plan?
LEAVE : Whether customer left or stayed
Your goal is to find and explain one natural grouping within the data. You only need to find one way
to group customers, and then explain that grouping in business terms.
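As a rough illustration of what finding a grouping can look like, here is a minimal k-means sketch in base R. The brief does not prescribe k-means; it is just one possible technique. The data below is synthetic stand-in data (not the BondTelco data), only three of the numeric columns are imitated, and the choice of k = 2 is an assumption that suits this toy data only.

```r
# Minimal k-means sketch on SYNTHETIC stand-in data (not the BondTelco data).
set.seed(42)
n <- 200

# Hypothetical stand-in for three numeric columns from the data dictionary:
# two artificial groups of n / 2 customers each.
customers <- data.frame(
  INCOME        = c(rnorm(n / 2, 40000, 5000), rnorm(n / 2, 120000, 10000)),
  OVERAGE       = c(rnorm(n / 2, 20, 5),       rnorm(n / 2, 180, 20)),
  HANDSET_PRICE = c(rnorm(n / 2, 200, 30),     rnorm(n / 2, 700, 60))
)

# Standardise so that large-valued columns such as INCOME do not dominate
# the distance calculation.
scaled <- scale(customers)

# k = 2 is an assumption for this toy data; with real data, compare several k.
km <- kmeans(scaled, centers = 2, nstart = 25)

# Describe each cluster in the original units, ready for a business reading.
cluster_profile <- aggregate(customers, by = list(cluster = km$cluster), FUN = mean)
print(cluster_profile)
```

The per-cluster means in `cluster_profile` are the kind of summary that can then be turned into a plot and a plain-language description for management.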
Deliverables:
Your final deliverables will be 2 PDF files, both produced by the same .Rmd script (with different
code chunk options). You must submit:
PDF1 - all code and results shown (like you would share with a colleague on the Data Science team)
PDF2 - only show those things necessary to help support management decision making (this is the
one you send to management!)
These will both be submitted online through iLearn.
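One way to produce both PDFs from a single script is a document parameter that toggles the chunk options. The sketch below is only an assumption about how this could be set up: the file name `assignment4.Rmd`, the output file names, and the `audience` parameter are all hypothetical and do not come from the brief.

```r
# Sketch only: `assignment4.Rmd` and the `audience` parameter are hypothetical.
# The .Rmd is assumed to declare the parameter in its YAML header:
#   params:
#     audience: "team"

# Inside the setup chunk of the .Rmd: echo code only for the team version.
show_code <- params$audience == "team"
knitr::opts_chunk$set(echo = show_code, message = FALSE, warning = FALSE)

# From the console: knit the same script twice, once per audience.
rmarkdown::render("assignment4.Rmd", output_format = "pdf_document",
                  output_file = "PDF1_team.pdf",
                  params = list(audience = "team"))
rmarkdown::render("assignment4.Rmd", output_format = "pdf_document",
                  output_file = "PDF2_management.pdf",
                  params = list(audience = "management"))
```

Chunk options accept R expressions, so an individual chunk whose output management does not need could be hidden with an option such as `include = params$audience == "team"`.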
Guidance:
Document
o Focus on a good document structure and layout (revisit week 2 on repeatable
research)
o Hint: Think about the headings in the document you produce
Focus on letting the visualizations do the talking. Only include explanatory text where it is
really necessary. Remember, though, that management do not really understand data science, so you
will need to balance clarity against brevity. Verbose assignments will be penalized.
o You will need to use visualizations
o You will need to explain at least one of the clusters, and the explanation must be useful
from a business perspective
Note:
As is the case with all assignments I set, if you do the minimum (correctly), then you will receive half
marks. Additional marks are awarded for those assignments where you have clearly put in
additional thought, whether it be in visualization, modelling, succinctness, or coding elegance.