Date: 2018-12-18 09:01

Final Report:

SALES OF ORTHOPEDIC EQUIPMENT

The objective of this study is to find ways to increase sales of orthopedic products from our company to hospitals in the United States. Find the hospitals that have high consumption of such equipment but where our sales are low. Come up with a selected group where you think our efforts will be rewarded.

The following description of the dataset includes the variable names and some summaries of the variables.

A file with a shell SAS program that follows the analysis steps is provided in another link.

DATASET ORTHOPEDIC

VARIABLES:

ZIP : US POSTAL CODE

HID : HOSPITAL ID

CITY : CITY NAME

STATE : STATE NAME

BEDS : NUMBER OF HOSPITAL BEDS

RBEDS : NUMBER OF REHAB BEDS

OUTV : NUMBER OF OUTPATIENT VISITS

ADM : ADMINISTRATIVE COST (in $1000's per year)

SIR : REVENUE FROM INPATIENT

SALES : SALES OF REHAB. EQUIP in $1000's per year

HIP : NUMBER OF HIP OPERATIONS

KNEE : NUMBER OF KNEE OPERATIONS

TH : TEACHING HOSPITAL 0, 1

TRAUMA : DO THEY HAVE A TRAUMA UNIT? 0, 1

REHAB : DO THEY HAVE A REHAB UNIT? 0, 1

HIP2 : NUMBER HIP OPERATIONS Year 2

KNEE2 : NUMBER KNEE OPERATIONS Year 2

FEMUR2 : NUMBER FEMUR OPERATIONS Year 2

Overview of the Analysis

Part 1. Select your market segment(s).

1. Select cases:

Select a group of states for the study (it is enough to select about 2000-2500 hospitals, at random or by region). Set the zero values of SALES to missing values.
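As a rough illustration of this step (the assignment itself is to be done in SAS), the case selection and the recoding of zero SALES can be sketched in Python. The records, the chosen states and the helper name `select_cases` are all made up for the example; `None` stands in for a missing value.

```python
# Hypothetical records; the field names follow the variable list above.
hospitals = [
    {"HID": "H1", "STATE": "NJ", "SALES": 120},
    {"HID": "H2", "STATE": "NJ", "SALES": 0},
    {"HID": "H3", "STATE": "NY", "SALES": 35},
    {"HID": "H4", "STATE": "CA", "SALES": 0},
]

SELECTED_STATES = {"NJ", "NY"}  # assumed choice of region


def select_cases(rows, states):
    """Keep hospitals in the chosen states and recode SALES == 0 as missing."""
    selected = []
    for row in rows:
        if row["STATE"] not in states:
            continue
        cleaned = dict(row)  # copy so the input records are left untouched
        if cleaned["SALES"] == 0:
            cleaned["SALES"] = None  # None plays the role of a missing value
        selected.append(cleaned)
    return selected


cases = select_cases(hospitals, SELECTED_STATES)
```

Hospitals with SALES recoded to missing stay in the data set: they are the candidates that are not yet customers.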

2. Transformations:

Look at each individual variable and decide whether a transformation is appropriate and, if so, which one. Some candidates are log(1+c*x), where the constant c changes from variable to variable (0.1, 0.01, 0.001, …), the sqrt transformation, or any other.
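One way to compare candidate transformations, besides looking at histograms, is to check how much each one reduces the skewness of the variable. The sketch below is only an illustration in Python: the SALES values are invented, and the helper names `log_transform`, `sqrt_transform` and `skewness` are not part of the assignment.

```python
import math


def log_transform(x, c):
    """Return log(1 + c*x); missing values (None) stay missing."""
    return None if x is None else math.log1p(c * x)


def sqrt_transform(x):
    """Return sqrt(x); missing values (None) stay missing."""
    return None if x is None else math.sqrt(x)


def skewness(values):
    """Moment-based sample skewness, ignoring missing values."""
    xs = [v for v in values if v is not None]
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((v - mean) ** 2 for v in xs) / n
    m3 = sum((v - mean) ** 3 for v in xs) / n
    return m3 / m2 ** 1.5


# Made-up, right-skewed SALES values in $1000's; None marks a missing value.
sales = [2, 5, 8, 12, 20, 45, 130, 900, None]

print(f"raw: skewness={skewness(sales):.2f}")
for c in (0.1, 0.01, 0.001):
    transformed = [log_transform(x, c) for x in sales]
    print(f"log(1+{c}*x): skewness={skewness(transformed):.2f}")
```

A skewness closer to zero suggests a more symmetric histogram, but the final choice should still be made by inspecting the histogram of each variable.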

3. Dimension reduction.

i) Separate the variables into the following groups:

Response: SALES, SALES=0 => SALES=NA

An alternative approach, less important here: SALESCAT = 1: 0-median, 2: median-80%, 3: 80%-100%.

Demographics: BEDS, RBEDS, OUTV, ADM, SIR, TH, TRAUMA, REHAB

Operation numbers: HIP, KNEE, HIP2, KNEE2, FEMUR2

Typical transformations are of the type shown below, but not exactly these, so you need to try several possibilities for each variable until the histogram looks acceptable.

HIP = sqrt(HIP) or SALES = log(1+0.1*SALES)

ii) Use the factor method to summarize the demographic variables and the operation variables, and come up with a final reduced list of factor variables (perhaps 3 or 4). Use the rotated factors to find a good interpretation of the factors and try to tell a good story.

4. Market segmentations.

i) Independent variables are used to divide the list of hospitals (all possible clients = the market) into subsets, which we call market segments or clusters.

Use cluster analysis to find the market segments or clusters. Since we are summarizing the variables with factors, use the factors. One way of choosing the number of clusters is to move the data into R, apply the silhouette function with pam to calculate the silhouette statistic, and use it to decide the number of clusters. Then move the cluster variable back to SAS if you prefer.
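To make the silhouette statistic concrete, here is a small Python sketch that computes the average silhouette width for a given cluster assignment. It does not implement pam and is not the R `silhouette` function; the factor scores and the two candidate labelings are invented, and the function assumes at least two clusters.

```python
import math


def mean_distance(point, others):
    """Average Euclidean distance from point to each point in others."""
    return sum(math.dist(point, o) for o in others) / len(others)


def average_silhouette(points, labels):
    """Average silhouette width of a clustering (needs at least two clusters)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # a singleton cluster contributes a width of 0
        # a: mean distance to the other members of the point's own cluster
        a = sum(math.dist(p, o) for o in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(mean_distance(p, members)
                for other, members in clusters.items() if other != lab)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


# Invented two-dimensional factor scores for six hospitals.
scores = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
two_clusters = [1, 1, 1, 2, 2, 2]
three_clusters = [1, 1, 3, 2, 2, 2]

for name, labels in (("k=2", two_clusters), ("k=3", three_clusters)):
    print(name, round(average_silhouette(scores, labels), 3))
```

The number of clusters with the larger average width is the better-separated solution; in practice the comparison would be run over the pam results for several values of k.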

iii) Once the clusters are chosen, we must study the summary statistics for each cluster and try to describe their content. Interpretation is very important at this stage. Make a boxplot of SALES or transformed SALES versus CLUSTER_NUMBER, identify the clusters with the highest SALES, and focus on the top cluster or clusters.

v) Finally, we select the cluster or clusters that agree with our objectives. These are clusters with high sales and with good characteristics, such as a high number of operations, etc.

In this study you are looking for segments with overall high sales that also contain hospitals where the company's sales are NA, so they are not yet our customers. Some segments will have mostly low sales. This means that those hospitals have few patients who would need our products, so we are not interested in them.
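The screening described above can be summarized numerically as well as with boxplots. The following Python sketch, using invented (cluster, SALES) pairs and a hypothetical helper `summarize_segments`, reports for each cluster the median of the observed SALES and the number of hospitals whose SALES is missing.

```python
from statistics import median

# Hypothetical (cluster, SALES) pairs; None means SALES is missing.
rows = [
    (1, 5), (1, 8), (1, None),
    (2, 120), (2, 200), (2, None), (2, None),
    (3, 40), (3, 55),
]


def summarize_segments(rows):
    """Per cluster: size, median of observed SALES, count of missing SALES."""
    by_cluster = {}
    for cluster, sales in rows:
        by_cluster.setdefault(cluster, []).append(sales)
    summary = {}
    for cluster, values in by_cluster.items():
        observed = [v for v in values if v is not None]
        summary[cluster] = {
            "n": len(values),
            "median_sales": median(observed) if observed else None,
            "n_missing": len(values) - len(observed),
        }
    return summary


summary = summarize_segments(rows)

# Rank clusters by median observed SALES, highest first.
ranked = sorted(
    (c for c in summary if summary[c]["median_sales"] is not None),
    key=lambda c: summary[c]["median_sales"],
    reverse=True,
)
for c in ranked:
    print(c, summary[c])
```

A segment near the top of the ranking that also has a nonzero count of missing SALES is the kind of segment the study is looking for.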

Part 2. Estimating potential gain in sales.

Potential gain in sales is the difference between current sales and the average of sales to similar hospitals. If you are analyzing a very small cluster (N < 20), we might assume that the sales are homogeneous, and the “average sales to similar hospitals” is just the average sale to that cluster. But if the cluster is larger, we will need to obtain a regression estimate.

This is the procedure:

i) Run a regression for each of the t selected segments. Because the segments are very homogeneous, you may expect that the R-square will not be very high, SO DO NOT BE CONCERNED WITH LOW R-SQUARES.

ii) The hospitals with large negative residuals are the ones whose sales are low even though their characteristics suggest higher sales, that is, they are below their potential sales (use the predicted values as potential sales). Make a list of the hospitals in your segment where sales can be improved.

iii) Give your estimate of the potential gains.
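The procedure above can be sketched in Python for a single segment and a single predictor. This is an illustration only: the data are invented, the predictor could be a factor score or any transformed variable, and the code assumes SALES is on its original scale (a transformed response would have to be back-transformed first). Treating a hospital with SALES = NA as having current sales of zero is an assumption of the sketch, and it flags every negative residual rather than only the large ones.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical segment: (HID, predictor, SALES); None marks SALES = NA.
segment = [
    ("H1", 1.0, 10.0),
    ("H2", 2.0, 22.0),
    ("H3", 3.0, 18.0),
    ("H4", 4.0, 41.0),
    ("H5", 5.0, 49.0),
    ("H6", 3.5, None),
]

# Fit the regression on hospitals with observed SALES only.
observed = [(h, x, y) for h, x, y in segment if y is not None]
intercept, slope = fit_line([x for _, x, _ in observed],
                            [y for _, _, y in observed])

gains = {}
for hid, x, y in segment:
    potential = intercept + slope * x      # predicted value = potential sales
    current = 0.0 if y is None else y      # assumption: NA means no sales yet
    if potential > current:                # negative residual: below potential
        gains[hid] = potential - current

total_gain = sum(gains.values())
print(gains, total_gain)
```

The dictionary `gains` is the list of hospitals where sales can be improved, and `total_gain` is one possible estimate of the potential gain for the segment.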

EXTRA CREDIT: All these parts are required to be performed using SAS. In addition, you could compare the results from SAS with a similar robust analysis using R. The R analysis would apply the methods for robust clustering (pam) and for classification and regression trees (rpart).

PAM: compare the clusters given by PAM with those from SAS, are they similar?

RPART: The idea here is to take the SALES variable that was defined earlier as the response. Run the tree method, select one good node that has very high sales, find the hospitals in that group that have SALES=NA, and estimate a potential sales gain.

Use the rpart package in R. The rpart function is similar to lm in the sense that it accepts “predict” for new data.
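The idea of fitting a tree and then predicting for new hospitals can be illustrated with a one-split regression tree (a "stump") in Python. This is not rpart and does not reproduce its splitting or pruning rules; the KNEE counts, the SALES values and the helper names `best_split` and `predict` are made up for the example.

```python
def best_split(xs, ys):
    """Find the threshold on x that minimizes the within-node sum of squares."""
    def sse(vals):
        m = sum(vals) / len(vals)
        return sum((v - m) ** 2 for v in vals)

    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between equal x values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        cost = sse(left) + sse(right)
        if best is None or cost < best[0]:
            threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
            best = (cost, threshold,
                    sum(left) / len(left), sum(right) / len(right))
    _, threshold, left_mean, right_mean = best
    return threshold, left_mean, right_mean


def predict(stump, x):
    """Return the mean SALES of the node that x falls into."""
    threshold, left_mean, right_mean = stump
    return left_mean if x < threshold else right_mean


# Hypothetical hospitals with observed SALES: KNEE operations and SALES.
knee = [10, 20, 30, 200, 220, 250]
sales = [5, 7, 6, 90, 110, 100]
stump = best_split(knee, sales)

# A hospital with SALES = NA and 240 knee operations falls in the high node;
# the node mean serves as its estimated potential sale.
potential = predict(stump, 240)
print(stump, potential)
```

With rpart the same pattern applies: fit the tree on hospitals with observed SALES, then call predict on the hospitals with SALES=NA to see which node, and which average sale, they fall into.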

