Need Help in Business Analytics

In a single Word document, Chapter 9 Case Problem: “Grey Code
Corporation.” If using Excel or Minitab for your calculations, charts,
and graphs, please copy and paste your work into the Word document. Do
not attach Excel or Minitab as separate documents.
Grey Code
Corporation (GCC) is a media and marketing company involved in magazine
and book publishing and in television broadcasting. GCC’s portfolio of
home and family magazines has been a long-running strength, but it has
expanded to become a provider of a spectrum of services (market
research, communications planning, web site advertising, etc.) that can
enhance its clients’ brands.
GCC’s relational database contains
over a terabyte of data encompassing 75 million customers. GCC uses the
data in its database to develop campaigns for new customer acquisition,
customer reactivation, and identification of cross-selling opportunities
for products. For example, GCC will generate separate versions of a
monthly issue of a magazine that will differ only by the advertisements
they contain. It will mail a subscribing customer the version with the
print ads identified by its database as being of most interest to that

particular problem facing GCC is how to boost the customer response
rate to renewal offers that it mails to its magazine subscribers. The
industry response rate is about 2%, but GCC has historically performed
better than that. However, GCC must update its model to correspond to
recent changes. GCC’s director of database marketing, Chris Grey, wants
to make sure that GCC maintains its place as one of the top achievers in
targeted marketing. The file Grey contains 38 variables
(columns) and over 40,000 rows (distinct customers). The table appended
to the end of this case provides a list of the variables and their
Play the role of Chris Grey and construct a classification model to identify customers who are likely to respond to a
mailing. Write a report that documents the following steps:Your
report should include appropriate charts (ROC curves, lift charts,
etc.) and include a recommendation on how to apply the results of your
proposed model. For example, if GCC sends the targeted marketing to the
top 10% of the test set that the model believes is most likely to renew,
what is the expected response rate? How does that compare to the
industry’s average response rate
VariableDescriptionCustomerIDCustomer identification numberRenewal1 if customer renewed magazine in response to mailing, 0 otherwiseAgeCustomer age (ranges from 18 to 99)HomeOwnerLikelihood of customer owning their own homeResidenceLengthNumber of years customer has lived at current residence. Values: , , , , , , , , , , , , , , or moreDwellingTypeIdentifies the type of residence. . , Gender, , Marital, , (divorced, widowed, etc.), HouseholdSizeIdentifies the number of individuals in the household. Arguments are: , , , , , ChildPresentIndicates if children are present in the home. ; ; Child0-5Likelihood of child 0–5 years old present in homeChild6-12Likelihood of child 6–12 years old present in homeChild13-18Likelihood of child 13–18 years old present in homeIncomeEstimated income. Ranges from $5,000 to $500,000 OccupationBroad aggregation of occupations into high level categories. Arguments are: (blue collar type jobs), (caregivers, unemployed, homemakers), HomeValueThe estimated home value in ranges. Arguments are MagazineStatusIdentifies the status for a customer based on their magazine business activity.PaidDirectMailOrdersNumber of paid direct mail orders across all magazine subscriptionsYearsSinceLastOrderYears since last order across all business linesTotalAmountPaidTotal dollar amount paid for all magazine subscriptions over timeDollarsPerIssuePaid Amount/Number of Issues Served. Average value per issue (takes the subscription term into account)TotalPaidOrdersTotal # of paid orders across all magazine subscriptionsMonthsSinceLastPaymentRecency – # months since most recent paymentLastPaymentTypeIndicates
how the customer paid on the most recent order. If it was credit order
it will contain the billing effort number (how many bills were sent to
collect payment).on ith billingUnpaidMagazinesNumber of magazine titles currently in “unpaid” status for a given magazine customerPaidCashMagazinesNumber of magazine titles currently in “paid cash” status for a given magazine customerPaidReinstateMagazinesNumber of magazine titles currently in “paid reinstate” status for a given magazine customerPaidCreditMagazinesNumber of magazine titles currently in “paid credit” status for a given magazine customerActiveSubscriptionsNumber of different magazines the customer is in “Active” statusExpiredSubscriptionsNumber of different magazines the customer is in “Expire” statusRequestedCancellationsNumber of different magazines the customer is in “Cancelled via Customer Request” statusNoPayCancellationsNumber of different magazines the customer is in “Cancelled for non-payment” statusPaidComplaintsNumber of different magazines the customer is in “Paid Complaint” statusGiftDonorYes/No indicator as to whether the customer has given a magazine subscription as a giftNumberGiftDonationsNumber of subscription gift orders for this customerMonthsSince1stOrderRecency (in months) of 1st order for this magazineMonthsSinceLastOrderRecency (in months) of most recent order for this magazineMonthsSinceExpireRecency
(in months) since the customer’s subscription has expired for this
magazine. Negative values represent months until an active subscription
expiresExplore the data. Because of the large number of variables, it may be helpful to filter out unnecessary and redundant variables.
partition the data set into training, validation, and test sets.
Experiment with various classification methods and propose a final model
for identifying customers who will respond to the targeted marketing