Modeling Customer Buying Behavior for the Prepaid Cell Phone Service Provider Tracfone

A simple (attempted) application of some probabilistic modeling concepts.  The story is much more complicated than presented here and more than the model can handle.  I will continue to develop the model to address the issues that led to its failure, as described in the closing discussion.

Jeisun Charles Wen

The Problem: Tracking the Customer

Tracfone is the largest vendor of prepaid cellular phone service in the United States.  Unlike a conventional cellular service provider like Cingular, T-Mobile or Verizon, Tracfone sells its cellular services piecewise in the form of cellular time and days of active service.[1]  This business model poses challenges for Tracfone in keeping track of its customer base.  In particular, the non-contractual nature of the relationship between Tracfone and its customers makes calculations such as customer lifetime value harder for Tracfone than for traditional cellular phone service providers like Cingular.

The difficulty in projecting customer purchases over a period of time stems from the unobservability of Tracfone's customers.  With a traditional cellular phone service, customers purchase a subscription-style plan and pay a monthly fee to use the service.  In order to cancel their plans, they must actively contact the phone service provider and ask to cancel.  With Tracfone's customers, it is a different story.  Tracfone can only "observe" the events in which its customers pay to use its service; it cannot observe when a customer decides to stop.  When a past customer has gone a long period of time without buying more cellular time, it can be due to two reasons.  Either the customer has decided to stop using Tracfone or the customer's present level of usage of his/her cell phone does not warrant the purchase of additional minutes.  In the second case, the fact that the customer has chosen not to purchase in the present period does not mean that he/she will not purchase more in the future.

To overcome these problems of unobservability, I present a stochastic consumer behavior model that seeks to estimate a person's latent purchasing habits based on frequency of purchase over a set period of time.  The model I have chosen to analyze the frequency data for online purchases of Tracfone's services is the shifted Negative Binomial Distribution or shifted NBD model.  Based on the individual behavior of interest, the shifted NBD framework matches well with the story of Tracfone's customers.

 

Reasons for Selecting the Shifted NBD

The shifted NBD model I chose arises from mixing a Poisson distribution with a gamma distribution.  In the story about Tracfone's customers, both the Poisson and the gamma distributions fit well with the behaviors of interest.

The Poisson process is used to model the observed buying patterns of a cohort of customers over a period of time.  It does this through its rate parameter λ.  The Poisson fits well with the story of customer purchase frequency for three reasons.  First, the Poisson distribution takes only non-negative values, which fits the story that a "negative" number of purchases cannot be observed.  Second, the distribution is discrete, just as purchase counts can only take integer values.  Third, the Poisson does not have an upper bound.  In the same way, the number of purchases a customer can make over a time period is, under most circumstances, not limited by a maximum quantity.

The second part of the shifted NBD model is the gamma distribution that governs the aforementioned rate parameter λ.  This reflects the fact that the group of customers being observed is heterogeneous, that is, different people purchase Tracfone's services at different rates.  This is a reasonable assumption given that people use their cell phones with varying degrees of frequency and intensity.[2]  There are four reasons why the gamma distribution is a good one for modeling the rate parameter λ.  First, it is a positive distribution, which reflects the fact that the rate being observed has no negative interpretation.  Second, the gamma distribution is continuous, just as λ does not have to be an integer.  Third, the gamma distribution, given by scale parameter α and shape parameter r, can assume a wide variety of shapes.  Since we do not know the specific shape and magnitude of the underlying distribution, the gamma distribution gives the model a lot of freedom to fit the data.  Last, the gamma distribution, when mixed with the Poisson, results in a closed-form solution that is easy to work with.
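The closed form referred to above is the NBD probability mass function.  What follows is a minimal sketch rather than the calculation actually used for this paper.  It assumes the gamma density is written so that α enters as an inverse scale, g(λ) = α^r λ^(r-1) e^(-αλ) / Γ(r), and it adds an observation-period length t that does not appear in the text.  The function name nbd_pmf is illustrative.

```python
import math


def nbd_pmf(x, r, alpha, t=1.0):
    """P(X = x) for a Poisson count whose rate is gamma distributed.

    Assumption: alpha enters the gamma density as an inverse scale, so the
    mean purchase rate is r / alpha.  t is the observation-period length.
    """
    # Work in logs so that large counts do not overflow the gamma function.
    log_p = (
        math.lgamma(r + x) - math.lgamma(r) - math.lgamma(x + 1)
        + r * math.log(alpha / (alpha + t))
        + x * math.log(t / (alpha + t))
    )
    return math.exp(log_p)


# Illustrative parameter values only; they are not estimates from the data.
probabilities = [nbd_pmf(x, r=0.8, alpha=2.0) for x in range(200)]
mean_count = sum(x * p for x, p in enumerate(probabilities))
```

Under this parameterization the expected count over a unit period is r / α, which gives a quick sanity check on any pair of fitted values.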

The last part of the shifted NBD model is the "shift" itself.  Shifting the NBD, as the term is used here, removes the probability mass at X = 0, and there is a reason for this.  In the customer behavior of interest, a "zero" count has no meaning in the Tracfone data.  The model considers only the group of customers that made at least one purchase during the observation period, so the possibility of x = 0 is excluded by design.  The probability mass at x = 0 was removed by prorating it over the remaining probabilities:

P(Y = y) = P(X = y) / (1 - P(X = 0)),   y = 1, 2, 3, . . .
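The prorating step can be sketched in a few lines.  This is an illustration under stated assumptions, not the original calculation: it uses a unit observation period, treats α as an inverse scale in the gamma density, and the function names are my own.

```python
import math


def nbd_pmf(x, r, alpha):
    """Untruncated NBD probability P(X = x) over a unit period.

    Assumption: alpha enters the gamma density as an inverse scale.
    """
    log_p = (
        math.lgamma(r + x) - math.lgamma(r) - math.lgamma(x + 1)
        + r * math.log(alpha / (alpha + 1.0))
        - x * math.log(alpha + 1.0)
    )
    return math.exp(log_p)


def truncated_nbd_pmf(y, r, alpha):
    """P(Y = y) = P(X = y) / (1 - P(X = 0)) for y = 1, 2, 3, ..."""
    if y < 1:
        return 0.0  # a zero count is excluded by design
    return nbd_pmf(y, r, alpha) / (1.0 - nbd_pmf(0, r, alpha))


# Illustrative parameter values only; they are not estimates from the data.
total_mass = sum(truncated_nbd_pmf(y, r=0.8, alpha=2.0) for y in range(1, 200))
```

Because every remaining probability is divided by the same constant, the relative frequencies of 1, 2, 3, ... purchases are unchanged; only their scale moves so that they sum to one.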

Having sufficiently examined the basis for applying the shifted NBD model to the Tracfone data, I now move on to looking at the data set itself.

 

Tracfone Data, Treatment, and Tests

The Tracfone data comes from the comScore database available through WRDS.  Transaction records were drawn from the database of comScore's panelists for the year 2004.  Two cohorts of customers were sampled from the data.  Cohort 1 consisted of customers who made at least one purchase during the first quarter (January to March).  Cohort 2 consisted of customers who made at least one purchase during the first half of the year (January to the end of June).  Cohort 1 is a subset of Cohort 2 by design.  Subsequent purchase data were also collected for both cohorts.

The numbers of purchases for Cohorts 1 and 2 were plotted as histograms over the three-month and six-month periods, respectively.  The results are summarized by Graphs 1 and 2 below.

            GRAPH 1

            GRAPH 2

These actual counts were used to calibrate the parameters of the shifted NBD model for each cohort.  The predicted values of both models are compared with the actual data in the next two graphs.

            GRAPH 3

            GRAPH 4
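The text does not say how the parameters were calibrated.  One common choice is maximum likelihood, and the sketch below shows what that could look like.  Everything in it is an assumption made for illustration: the counts are synthetic (generated from the model itself with known parameters, not taken from the Tracfone data), the period length is one, α is treated as an inverse scale, and the optimizer is a crude dependency-free compass search rather than whatever tool was actually used.

```python
import math


def log_truncated_nbd(y, r, alpha):
    """Log P(Y = y) for the zero-truncated NBD over a unit period, y >= 1."""
    log_p0 = r * math.log(alpha / (alpha + 1.0))
    log_px = (
        math.lgamma(r + y) - math.lgamma(r) - math.lgamma(y + 1)
        + log_p0 - y * math.log(alpha + 1.0)
    )
    return log_px - math.log1p(-math.exp(log_p0))


def log_likelihood(counts, r, alpha):
    """counts maps a purchase count y to the number of customers with that count."""
    return sum(n * log_truncated_nbd(y, r, alpha) for y, n in counts.items())


def fit_shifted_nbd(counts, start=(1.0, 1.0), step=0.5, tol=1e-6):
    """Maximize the likelihood by compass search on (log r, log alpha)."""
    point = [math.log(start[0]), math.log(start[1])]
    best = log_likelihood(counts, math.exp(point[0]), math.exp(point[1]))
    while step > tol:
        improved = False
        for i in (0, 1):
            for delta in (step, -step):
                trial = list(point)
                trial[i] += delta
                value = log_likelihood(counts, math.exp(trial[0]), math.exp(trial[1]))
                if value > best:
                    point, best, improved = trial, value, True
        if not improved:
            step /= 2.0  # no direction helps at this scale, so refine it
    return math.exp(point[0]), math.exp(point[1]), best


# Synthetic histogram: expected counts for 1000 hypothetical customers.
TRUE_R, TRUE_ALPHA = 1.5, 2.0
synthetic_counts = {
    y: 1000.0 * math.exp(log_truncated_nbd(y, TRUE_R, TRUE_ALPHA))
    for y in range(1, 41)
}
r_hat, alpha_hat, ll_hat = fit_shifted_nbd(synthetic_counts)
```

Searching over the logarithms of r and α keeps both parameters positive without explicit constraints.  With real data the histogram would replace synthetic_counts, and a library optimizer would be a sensible substitute for the hand-written search.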

The mean absolute percentage error (MAPE) for Cohort 1 was 6.3%, while the MAPE for Cohort 2 was 16%.  This suggests that neither model is truly a good fit for the data.  Short of a reasonable case for extending the model with a spike, I will stop here.  Later, I will examine the failures of the shifted NBD model, but first, let us look at the annual forecasts made by each model.  The graphs for the yearly forecasts show the actual and predicted purchasing frequencies for the members of each respective cohort.

            GRAPH 5

            GRAPH 6

Neither model can claim to be a good fit to the actual purchasing numbers for the year by the members of each cohort.  The Cohort 1 model had a MAPE of 20.4% and the Cohort 2 model had a MAPE of 15.7%.  Relative to the in-sample fit, the percent error roughly tripled for the Cohort 1 model (from 6.3%) and remained about the same for the Cohort 2 model (from 16%).
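For reference, the MAPE figures quoted in this section can be computed along the following lines.  This is a sketch that assumes the error is averaged over histogram bins of actual versus predicted customer counts; the text does not spell out the exact convention used, and the numbers in the example are made up.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, returned in percent.

    Every actual value must be non-zero, since each error is measured
    relative to it.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Made-up bin counts, for illustration only.
example_error = mape([100, 200], [110, 180])
```

Note that MAPE weights every bin equally, so a sparse bin in the right tail of the histogram can move the figure as much as the bin holding most of the customers.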

Since neither model predicts the yearly purchasing of its cohort members well, the reasons for the discrepancy between the actual and predicted values must be explored.

 

Model Assumptions Vs. the Real World

Seeing the dismal fit and forecasts of both models, I began to examine potential causes of the poor forecasting accuracy by scrutinizing the model assumptions.  The most immediate culprit that comes to mind is non-stationarity.  One of the assumptions the NBD model relies on when it makes projections into the future is that the underlying distribution of the rates λ does not change over the period being forecast.  From researching Tracfone, I have come up with two reasons why stationarity may not hold for its data (there may be more).

First, Tracfone has a marketing strategy that makes heavy use of discounts and promotions.  Customers who are familiar with this cycle of discounting will alter their purchasing habits to take advantage of these promotions.  Therefore, the rate λ changes depending on the promotion cycle.  The stationarity assumption of the NBD model does not hold well in an environment with heavy discounting.

Second, there are unexplored relationships between purchase frequency and value of purchase.  As mentioned earlier, Tracfone sells cellular phone time in a piecewise manner.  Customers who purchase larger chunks of time per transaction probably wait longer between purchases, on average, than those who purchase smaller chunks.  This property would be captured under the proposed shifted NBD model, but forecasting power will disintegrate if customers frequently switch between different time packages (perhaps due to promotions and discounts).  Since there may be a strong correlation between price of purchase and purchase frequency, stationarity may not hold up in an environment where customers often change the type of cellular phone time package they buy.

Looking at the data, there is the potential to try out different models that might provide a better fit with superior forecasting capabilities.  Maybe the counting model based on the shifted NBD is not enough to capture the entire story behind Tracfone's customers.  Some suggestions for further modeling exercises would include modeling depth of repeat and using timing models.


 

[1] In order to use Tracfone's service, both must be purchased.  Cellular time is sold in minutes.  Purchases of days of active service are required in order to use the minutes.  If the days of active service expire, customers will no longer be able to use their Tracfones even if they still have unused minutes.

[2] Unless you consider the case where there is a very limited set of cell phones in use.  In the most extreme case with only two existing cell phones, phone usage is completely dependent.  When one person calls, the other answers, and so the number of minutes being logged on each phone must be equal.  Forgive me for not being able to stop myself from pointing this out.