
Date: 2019-04-08 10:27

ANALYTICS SPECIALIZATIONS & APPLICATIONS, 2019

COURSEWORK 1 - Customer Analytics Study

Final Report Deadline: Tuesday 9th Apr 2019, 11.59pm

Submission: Via the Analytics Specialization and Applications Moodle Site

Work Plan Deadline: Tuesday 26th Mar 2019, 11.59pm

Submission: Via the Analytics Specialization and Applications Moodle Site

1. The Problem Definition:

In this coursework you will perform a market segmentation on a transactional dataset that

has been provided by a national convenience store chain (4 files describing 3000 customers

over 6 months). Through analysis of the company’s point-of-sale data you will produce

profiles for 5-7 customer segments. These will include a statistical summary and a pen

profile for each segment (these would be used by the retailer to better understand their

customer-base and to inform future marketing campaigns). Your final deliverable in this task

will be a report angled as though it is to be read by the company’s chief data officer and

marketing director (and so which should be presented accordingly). This report will be

accompanied by your technical implementation (expected to be in the form of Python and/or

SQL scripts) and a CSV file that lists the customers’ assignments to the segments you have

generated (simply linking their customer ids to a segment id as indicated in your report).
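
As an illustration of the results file described above, the following is a minimal sketch that writes one row per customer linking a customer id to a segment id. The column name segment_id, the example ids and the in-memory buffer are assumptions made for the example; only customer_id is named in this specification.

```python
import csv
import io

# Hypothetical assignments: customer_id -> segment_id (values are made up).
assignments = {1001: 3, 1002: 1, 1003: 5}


def write_assignments(assignments, handle):
    """Write a header row, then one row per customer sorted by customer_id."""
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(["customer_id", "segment_id"])
    for customer_id in sorted(assignments):
        writer.writerow([customer_id, assignments[customer_id]])


# The buffer stands in for a real file, e.g. open("segments.csv", "w", newline="").
buffer = io.StringIO()
write_assignments(assignments, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

Sorting by customer id keeps the file stable between runs, which makes it easier to compare against the segment ids quoted in the report.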

2. Expected Approach:

Being a subjective process, this customer segmentation study has been deliberately left

relatively open ended. There are numerous ways of approaching a task of this sort and the key

is to simply justify the method you take. Nevertheless, it is broadly expected that you will

implement your segmentation study in three parts:

1. Exploration of the sample data provided to establish which indicators you believe are

important to describe customer behaviour (justifying your decisions);

2. A quantitative analysis, considering the customer base in light of selected/engineered

features before then implementing a clustering approach to settle upon 5-7 customer

segments (justifying your chosen methodology);

3. An analysis of results in order to produce a tangible description of each cluster in the

form of a statistical summary, customer archetypes (pen profiles) and identification of

the two most attractive targets for the company.
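
The statistical summary in step 3 could be sketched along the following lines: group customers by their assigned segment and report the size and mean feature values of each group. The rows, the choice of features and the function name summarise_segments are illustrative assumptions, not part of the data provided.

```python
from collections import defaultdict
from statistics import mean

# Made-up rows for illustration: (customer_id, segment_id, total_spend, baskets)
rows = [
    (1, 0, 120.0, 10),
    (2, 0, 80.0, 6),
    (3, 1, 900.0, 60),
    (4, 1, 1100.0, 80),
]


def summarise_segments(rows):
    """Return per-segment customer count and mean of each feature."""
    grouped = defaultdict(list)
    for _, segment_id, total_spend, baskets in rows:
        grouped[segment_id].append((total_spend, baskets))
    summary = {}
    for segment_id, members in sorted(grouped.items()):
        summary[segment_id] = {
            "customers": len(members),
            "mean_total_spend": mean(m[0] for m in members),
            "mean_baskets": mean(m[1] for m in members),
        }
    return summary


segment_summary = summarise_segments(rows)
for segment_id, stats in segment_summary.items():
    print(segment_id, stats)
```

A table of this kind is only a starting point for a pen profile; medians or spreads may describe skewed spend features better than means.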

You should also submit a very short bullet-pointed work plan (of absolutely no more than one

page) prior to implementation of the final task (deadline 11.59pm, Tuesday

26th March) and worth 5% of your submission. This brief bullet-pointed work plan should set

out the form your analytical approach is expected to take, including a list of the features that

you expect to use/engineer in order to describe customers (whether RFM features,

product-based spends, temporal features, etc.), your proposed technical strategy for doing the

segmentation and brief details of how you intend to implement it (Note that this work plan is

not binding, and your final report may differ from it).

3. Customer Data Provided

The company have asked you to directly use a sample of 3000 customers’ behavioural data,

as recorded by loyalty card transaction logs. You must use this data alone to provide the

client a summarization of their customer base in the form of 5-7 cluster archetypes -

selected in a manner and number that you feel best describes the variation in the different

types of customers the client interacts with. The files available to you are as follows:

File: customers_sample.csv
Feature Description: A summary file detailing the consumer behaviour of 3000 customers (referenced by an anonymized but consistent “customer_id”). This data details the total no. of “baskets” (i.e. visits) the customer has processed at the retailer’s stores over the 6 month period, the “total_quantity” of items they have purchased, the “average_quantity” per basket, the “total_spend” the customer has made over the period, and “average_spend” per visit.

File: category_spends_sample.csv
Feature Description: This file again lists the 3000 customers in the sample, but this time breaks down their spend over the period into 20 item categories, which represent the range of items sold across the retailer. The names of these categories are included in the file’s header, and are self-explanatory.

File: baskets_sample.csv
Feature Description: This file details information about each individual visit made by the 3000 customers in the sample. Specifically, it details the timestamp for the visit (“purchase_time”), the quantity of items they purchased (“basket_quantity”), the amount of money they spent on that visit (“basket_spend”) and the number of different categories their purchases cut across for that trip (“basket_categories”).

File: lineitem_sample.csv
Feature Description: A final dataset is also available for use that breaks down each basket into its individual product purchase ids, along with the category each item belongs to. This is provided only if you want to extend your analysis into new features, and all customer_ids it references correspond to those in the other data files.
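
To illustrate how the basket-level file might be turned into per-customer features, here is a minimal sketch deriving recency, frequency, monetary and weekend-share values. The sample rows are made up, and the timestamp format, the period end date and the feature names are assumptions that would need checking against the real baskets_sample.csv.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime

# Made-up rows in the column layout described for baskets_sample.csv.
SAMPLE = """customer_id,purchase_time,basket_quantity,basket_spend,basket_categories
1,2019-03-01 08:15:00,3,6.50,2
1,2019-03-10 18:40:00,5,12.00,3
2,2019-03-04 12:05:00,1,2.25,1
"""

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed format, not confirmed by the spec


def basket_features(handle, period_end):
    """Aggregate basket rows into per-customer behavioural features."""
    visits_by_customer = defaultdict(list)
    for row in csv.DictReader(handle):
        when = datetime.strptime(row["purchase_time"], TIME_FORMAT)
        visits_by_customer[row["customer_id"]].append((when, float(row["basket_spend"])))
    features = {}
    for customer_id, visits in visits_by_customer.items():
        times = [when for when, _ in visits]
        features[customer_id] = {
            "recency_days": (period_end - max(times)).days,  # days since last visit
            "frequency": len(visits),                        # number of visits
            "monetary": sum(spend for _, spend in visits),   # total basket spend
            "weekend_share": sum(t.weekday() >= 5 for t in times) / len(times),
        }
    return features


# io.StringIO stands in for open("baskets_sample.csv", newline="").
features = basket_features(io.StringIO(SAMPLE), datetime(2019, 4, 1))
print(features)
```

The same loop could be extended with time-of-day bands or average basket_categories if those turn out to separate customers in the exploration stage.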

Report Structure

You must provide a report that clearly describes your proposed customer segmentation of

the client’s consumer base, based on the sample of transactional data they have provided.

The client company has requested that the number of groups to be used be between 5 and

7 inclusive - but the exact number is up to you. The customer segmentation process

will require a stage of basic statistical analysis, a stage of feature selection, a stage of

clustering, a stage of generating segments and then a stage of analysing the results to

generate pen profiles - these must all be described (please take note of the marking scheme

to decide how much of the report you should assign to each). Note that you may use any

software you desire for your analysis, but your clustering should be undertaken in Python 3

for this coursework. While the report is relatively flexible in structure it is expected that you

will have the following sections included:

1. An Executive Summary: including a description of the task, a summary of

your technical approach, a summary of the data that underpins it, a summary

of the results, and a summary of the insights you have arrived at.

2. A Feature Description section: a summary of the features you have selected and/or

generated from the data to describe customers, justifying the strategy you have

taken.

3. A Customer Base Summary section: A cursory exploratory analysis of the data you’ve

received, summarizing the company’s market according to the features you have

developed.

4. A Segmentation Methodology section: A description of the clustering approach you

have taken, justifying what you think is a good value of k, the number of groups to use.

5. A Results section: Having produced your clusters, in this section you must now name

and summarize them. This should be done both statistically, describing how the features

you have used vary across different clusters, and descriptively (each cluster should be

accompanied by a single paragraph detailing a brief vignette or “pen profile” that denotes

the customer ‘archetype’ reflected by that cluster).

6. A Summary Section: Here you will provide a summary of your results, the business

case for your clustering solution and a recommendation for two segments that you

believe will be of most importance to the company to focus attention to (including your

reasons why). Draw together any key take-home points that have dropped out of your

analysis, any marketing recommendations you may have as a result of your analysis,

or indeed any suggestions you have for the client regarding further potential analysis.
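
One way the choice of k in the Segmentation Methodology section might be justified is by comparing a cluster-quality score across the permitted values 5, 6 and 7. The sketch below is a dependency-free illustration using a plain k-means and the mean silhouette coefficient on synthetic points; in practice a library such as scikit-learn would more likely be used, features would normally be standardised first, and the data, seeds and function names here are all assumptions.

```python
import math
import random


def kmeans(points, k, iterations=100, seed=0):
    """Plain Lloyd's algorithm; returns one cluster label per point."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(iterations):
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:  # assignments stable, so the run has converged
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties out
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels


def silhouette(points, labels):
    """Mean silhouette coefficient; singleton clusters score 0 by convention."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        return 0.0
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)
            continue
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


# Synthetic 2-D points around six made-up centres, purely for illustration.
rng = random.Random(42)
centres = [(0, 0), (10, 0), (0, 10), (10, 10), (20, 0), (20, 10)]
points = [
    (cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5))
    for cx, cy in centres
    for _ in range(20)
]

scores = {k: silhouette(points, kmeans(points, k)) for k in (5, 6, 7)}
best_k = max(scores, key=scores.get)
print(scores, best_k)
```

A single score rarely settles the matter: k-means depends on its starting centroids, so repeating the comparison over several seeds and weighing it against how interpretable the resulting segments are would give a stronger justification.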

4. Marking Criteria

Your submission will be assessed based on the following mark scheme:

● Work Plan (submitted on March 26th) (5 marks)

● Executive Summary (5 marks)

● Appropriateness of Data preparation (5 marks)

● Appropriateness of Feature Selection/Engineering (20 marks)

● Description and Justification of Methodology (10 marks)

● Analysis and Description of Results (25 marks)

- Overall Customer base summary (5 marks)

- Individual Statistical Summaries of Clusters (10 marks)

- Pen Portraits of Clusters (10 marks)

● Efficacy of Insights and Recommendations (5 marks)

● Technical Implementation (20 marks)

- Functionality (15 marks)

- Clarity and Commenting (5 marks)

● Overall presentation and professionalism of Report (5 marks)

4. Submission Guidelines

Your Work Plan must be submitted by 11.59pm on 26th March via Moodle, using the document

title “Student ID - Your Name - ASA Coursework 1 – Workplan”.

Your Final Report must be submitted by 11.59pm on 9th Apr via Moodle. In your submission

please attach 1. Your Final Report (Do not forget, this has a strict maximum of 8 pages or 3000

words, with no appendices); 2. Your technical implementation / code; and 3. Your results file

denoting customer ids and their attributed segment. Please put your Student ID and Your

name at the beginning of your document name.

As per University guidelines late submissions will lose 5% from their final mark per day. All

text, code and workflows will also be examined to ensure there is no repetition between

submissions. Any plagiarised work will immediately receive zero marks.

5. Final Message from the company’s Chief Data Officer

“Dear consultant - we are aware customer segmentation is somewhat subjective, so we are happy to

offer the following guidance. The segmentations you produce will be used by ourselves to explore our

customer base in the first instance, but then subsequently for marketing, and your suggestions

should be guided by this usage. We also understand the job will be to decide how best to describe

each customer (i.e. feature selection). While we would be very interested in new ways you can think

of to describe our consumers’ behaviour, the following ways of describing customer behaviour might

be worth investigating:

Spend: It is often said that 20% of a customer base accounts for 80% of a company's

profits. This may or may not be true, but it does isolate the fact that some groups will produce

more revenue for the company than others, and this needs to be a part of the analysis. Hence

a customer’s “total_spend” might be included as a feature.

Frequency: We have noted that some of our customers exhibit different shopping patterns

over time. The frequency of visitation in a shopper’s patterns could well help to underpin

customer segments, so the “number_of_visits” may be worth considering.

Average Spend: We definitely want to consider the average item spend of each group

generated by the clustering process - are customers in a group purchasing many small items or

single higher cost products? This will give us an indication of whether that customer type is

focussed on low or high involvement products, and thus help shape our relationship with them.

You may wish to use a customer’s average_spend_per_item within your features. Along the

same lines, we also think you may want to include the average_basket_spend and

average_item_count for each visit as features.

And finally, the ability to distinguish customers in each segment by the different types of

products or categories that they buy from would allow us to potentially sculpt marketing effort

to different groups based on their actual interests, preferences and shopping missions.

Of course, these are just some of the core items you might want to take into account when you design

the features you are using to input into your clustering algorithm. Please extend as you see fit. We

appreciate your efforts and understand that time constraints mean that you won’t be able to

cover everything - nonetheless we look forward to seeing your individual ‘cut’ through this data in

the time you have available for the study.” - CDO.
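
The CDO's suggestions on spend, average item spend and product categories can be sketched as simple derived features. The example below checks what share of spend comes from the top 20% of customers (using spend as a stand-in for profit, which the data does not contain), computes an average spend per item, and converts category spends into shares. All values, the category names and the function names are made up for illustration.

```python
# Made-up customer summaries: customer_id -> (total_spend, total_quantity)
customers = {
    "a": (500.0, 100),
    "b": (200.0, 80),
    "c": (150.0, 50),
    "d": (100.0, 40),
    "e": (50.0, 10),
}


def top_share(spends, fraction=0.2):
    """Share of total spend contributed by the top `fraction` of customers."""
    ranked = sorted(spends, reverse=True)
    n_top = max(1, round(len(ranked) * fraction))
    return sum(ranked[:n_top]) / sum(ranked)


def spend_per_item(total_spend, total_quantity):
    """Average spend per item, guarding against customers with no items."""
    return total_spend / total_quantity if total_quantity else 0.0


def category_shares(category_spend):
    """Convert absolute category spends into proportions of the customer's total."""
    total = sum(category_spend.values())
    return {name: (value / total if total else 0.0) for name, value in category_spend.items()}


pareto = top_share([spend for spend, _ in customers.values()])
item_spend = {cid: spend_per_item(spend, qty) for cid, (spend, qty) in customers.items()}
shares = category_shares({"fruit_veg": 30.0, "bakery": 10.0})  # hypothetical categories
print(pareto, item_spend, shares)
```

Expressing category spend as shares rather than absolute amounts keeps high spenders from dominating the category features, though whether that suits the segmentation is a judgement to justify in the report.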

END OF COURSEWORK SPECIFICATION

