Friday, May 2, 2014

Determining which data plan is best for you

It can be difficult to find a data plan that best fits your personal usage. If you get one that’s too large, you waste money each month paying for unused data. Too small and the charges for additional bandwidth quickly add up. If a carrier offers few to no choices, it’s probably not that difficult to determine which data plan provides you with the best value, especially if your data usage is relatively constant each month. However, when there are a number of choices, together with sharing between family members and devices, the decision becomes less straightforward.

WARNING: Serious Geekery follows. Proceed at your own risk.


Because NTT Docomo is radically changing their data plan offerings this June, there is serious potential to save money, especially if you are currently paying for multiple people and/or devices. In this post, I’ll describe how I evaluated which plan was most likely to cost me the least amount of money over the next two years. You can easily get a rough idea by just looking at your past data consumption and calculating the monthly cost for any particular data plan after factoring in overage charges. However, I want to consider how long-term cost is affected by variable monthly usage. I also want to consider the variability in data consumption among multiple devices all drawing from the same shared data plan.

So I created a framework that can scale across multiple scenarios. I’ll describe the simplest scenario below, which is a single person with one device. Examples of more complicated scenarios are a phone and a data-only tablet sharing a 2 GB or 5 GB data plan, a family sharing a 10+ GB data plan, or a small business sharing a large corporate data plan.

To do this requires:
  • Knowledge of your past usage
  • Informed assumptions on future usage
  • A large number of simulated potential futures
  • Some form of decision making regarding purchasing additional data
The actual question I will ask is this: Which data plan for me (the 2 GB, 5 GB, or 7 GB) is most likely to result in the lowest total cost over two years?

Past usage

Unfortunately for me, billing went paperless (if you pay by credit card) in January 2013, and the my docomo page only shows the usage for the past 3 months. (My phone also stores only 3 months of mobile data consumption.) This means that I don’t have information regarding my full history of LTE data usage, so I’ll just have to make do with what I have. Below is my monthly data consumption shown as a time series on the left and as a histogram on the right.

I could use this “as is” as a predictor of future usage because I don’t really expect there to be many changes. There will be months when I use a lot of data, and there will be months when I don’t. However, I recall that I used nearly all of my current 7 GB quota at least once in the past, which must have happened during the period for which I have no information. So I will need to factor this into my prediction of future usage.

Probable future monthly usage

I found it easiest to create a probability density function (PDF) based on the characteristics of my actual usage but including two additional months of 7 GB usage, which I have set as the maximum.

The PDF above describes the probability of using any particular amount of data during any given month. There is zero probability of using zero data. This is because I have not been outside of Japan for an entire billing cycle. Probability rapidly climbs to just over a 20% chance of using between 3 and 4 GB during a month. Probability then declines slowly to less than a 5% chance of consuming 7 GB.

I can generate potential future data consumption values from this PDF. I can randomly draw 24 values and consider it a sort of prediction of the next two years, but this will represent only one particular outcome. What I really need to do is to simulate thousands of potential futures. The mean value will of course converge on the mean of the PDF, but I’m not concerned with the mean, since I can figure that out without simulating anything. I want to know about the potential for extreme values, which directly determines the cost of overages (or the amount of data I waste each month with a plan that is bigger than I need).
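To make the idea of drawing many potential futures concrete, here is a minimal sketch in Python. It assumes a binned stand-in for the PDF: the weights are illustrative guesses shaped to resemble the description above (a peak just over 20% between 3 and 4 GB, tapering toward 7 GB), not the actual fitted values, and the function names are hypothetical.

```python
import random

# Illustrative weights for 1 GB-wide usage bins (0-1 GB, ..., 6-7 GB).
# ASSUMPTION: shaped to resemble the description, not the real PDF.
BIN_WEIGHTS = [0.06, 0.15, 0.20, 0.21, 0.17, 0.13, 0.08]

def draw_month(rng):
    """Pick a bin by weight, then a usage value (GB) uniformly inside it."""
    low = rng.choices(range(len(BIN_WEIGHTS)), weights=BIN_WEIGHTS)[0]
    return rng.uniform(low, low + 1)

def simulate_futures(n_futures, months=24, seed=0):
    """Return n_futures lists, each holding `months` simulated monthly usages."""
    rng = random.Random(seed)
    return [[draw_month(rng) for _ in range(months)] for _ in range(n_futures)]

futures = simulate_futures(10_000)
# The mean is not the interesting part; the spread of heavy months is.
heavy = [sum(u > 6 for u in future) for future in futures]
print("mean monthly usage (GB):", sum(map(sum, futures)) / (len(futures) * 24))
print("most months above 6 GB in any one future:", max(heavy))
```

Each inner list is one possible two-year future. Counting the heavy months per future, rather than averaging everything, is what exposes the extreme outcomes that drive overage charges.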

Estimating monthly cost from predicted usage

Once you exceed your quota, you have two choices: 1) live with 128 kbps until the end of the month, or 2) buy more high-speed data. Decision making regarding this is driven by a number of factors, the most important of which is probably how close you are to the end of the month. I think most people would be reluctant to buy 1 GB of data when there is only one day left in the month. This of course could be overridden if a certain situation (e.g., job related) demanded a high-speed mobile connection RIGHT NOW.

I could attempt to simulate the decision making process around whether or not to buy more bandwidth, but it would really just be a bunch of BS that I pulled pretty much out of thin air.

Instead, I will set it up so that whole GB amounts always result in the purchase of additional bandwidth. Fractional amounts only result in an additional purchase if there would be more than 4 days remaining in the billing cycle (I did pull 4 days out of thin air). The number of days in the month remaining was estimated by uniformly distributing the total monthly usage amount across the entire month, i.e., daily usage is the total divided by the 30 days. This is best explained with an example.

Let’s consider that a value of 5.25 GB is drawn for a particular month. This is 3.25 GB in excess of the 2 GB quota. The first 3 GB generates ¥3,000 in additional data charges. If the 5.25 GB was used evenly over a 30-day month, this would be 0.175 GB per day. The remainder of 0.25 GB represents less than two days, and in this simulation, no additional data are purchased. The 0.25 GB are essentially thrown out, and the user struggles through 1.5 days at 128 kbps. The monthly cost would be ¥6,500 with the 2 GB plan and ¥5,000 with the 5 GB plan.
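The rule in this example can be written down as a small function. This is a sketch under stated assumptions: the ¥1,000 per extra GB and the monthly base prices (¥3,500 for 2 GB, ¥5,000 for 5 GB) are the values implied by the worked example, and the names are my own.

```python
import math

OVERAGE_YEN_PER_GB = 1000  # implied by "the first 3 GB generates ¥3,000"
DAYS_IN_MONTH = 30
MIN_DAYS_TO_BUY = 4        # the threshold pulled out of thin air

def monthly_cost(usage_gb, quota_gb, base_yen):
    """Cost of one month: base price plus any extra 1 GB purchases."""
    excess = usage_gb - quota_gb
    if excess <= 0:
        return base_yen
    whole_gb = math.floor(excess)          # whole GBs are always bought
    fraction = excess - whole_gb           # leftover partial GB
    daily_usage = usage_gb / DAYS_IN_MONTH # usage spread evenly over the month
    days_left = fraction / daily_usage     # days the leftover would cover
    buy_one_more = 1 if days_left > MIN_DAYS_TO_BUY else 0
    return base_yen + (whole_gb + buy_one_more) * OVERAGE_YEN_PER_GB

print(monthly_cost(5.25, 2, 3500))  # 6500
print(monthly_cost(5.25, 5, 5000))  # 5000
```

For 5.25 GB the leftover 0.25 GB covers about 1.4 days at 0.175 GB per day, which is under the 4-day threshold, so no extra GB is bought on either plan.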

And the least expensive one is…

The 2 GB plan, FOR ME. This cannot be applied to anyone else unless their usage habits are exactly like mine. In the figure below, blue shows the results for the 7 GB plan, green for the 5 GB, and red for the 2 GB. The primary graph is the cumulative cost. The dots are the mean of all 100,000 simulations, and the error bars show the standard deviation. The lower right bar graph shows the (mean) total after two years (i.e., where the lines stop in the upper right).

Because I have set 7 GB as the maximum, the potential total cost of the 7 GB plan does not vary during simulation and would cost a total of ¥136,800 over two years. The cost of the 5 GB and especially the 2 GB plans is variable across simulations. From the start until 24 months, the mean cost of the 2 GB plan is always lower than the 5 GB plan, but there is a possibility that, by chance alone, the 2 GB plan could actually cost more. You can see that the uncertainty in the 2 GB plan is quite large, and if many months’ data usage was actually drawn from the upper limit in the range of values, the cost would exceed the 5 GB plan. I estimate that the chance of this happening is rather low (but not so low as to be completely discarded). As shown in the upper left, in 76% of all simulations (76,000), the 2 GB plan resulted in a lower total cost. The reverse occurred only 24% of the time.

So based on this, I would have a reasonable chance of saving a modest amount of money over the next 2 years with the 2 GB plan, especially when you consider that I am free to modify my behavior if it seems I may exceed my quota.
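For anyone who wants to try the comparison themselves, here is a rough, self-contained sketch of the whole procedure. The usage weights are illustrative assumptions rather than my real PDF, so the percentages it prints will not reproduce the figure. The base prices are the ones implied by this post (¥136,800 / 24 = ¥5,700 for the 7 GB plan), and it runs 10,000 simulations instead of 100,000 to keep it quick.

```python
import math
import random
import statistics

# Monthly base prices in yen, as implied by the post: quota (GB) -> price.
PLANS = {2: 3500, 5: 5000, 7: 5700}
# ASSUMPTION: illustrative weights for 1 GB-wide usage bins (0-1 ... 6-7 GB).
BIN_WEIGHTS = [0.06, 0.15, 0.20, 0.21, 0.17, 0.13, 0.08]

def draw_month(rng):
    """Draw one month of usage (GB) from the binned stand-in distribution."""
    low = rng.choices(range(len(BIN_WEIGHTS)), weights=BIN_WEIGHTS)[0]
    return rng.uniform(low, low + 1)

def monthly_cost(usage_gb, quota_gb, base_yen):
    """Whole extra GBs are always bought; a leftover only if > 4 days remain."""
    excess = usage_gb - quota_gb
    if excess <= 0:
        return base_yen
    whole_gb = math.floor(excess)
    days_left = (excess - whole_gb) / (usage_gb / 30)
    return base_yen + 1000 * (whole_gb + (1 if days_left > 4 else 0))

def compare_plans(n_sims=10_000, months=24, seed=0):
    """Return {quota: [total cost per simulation]} using shared usage draws."""
    rng = random.Random(seed)
    totals = {quota: [] for quota in PLANS}
    for _ in range(n_sims):
        usage = [draw_month(rng) for _ in range(months)]
        for quota, base in PLANS.items():
            totals[quota].append(sum(monthly_cost(u, quota, base) for u in usage))
    return totals

totals = compare_plans()
for quota, values in totals.items():
    print(f"{quota} GB: mean {statistics.mean(values):,.0f} yen, "
          f"sd {statistics.pstdev(values):,.0f} yen")
wins = sum(a < b for a, b in zip(totals[2], totals[5]))
print(f"2 GB cheaper than 5 GB in {wins / len(totals[2]):.0%} of simulations")
```

All three plans are priced against the same simulated usage in each run, so the comparison between the 2 GB and 5 GB totals is made future by future rather than between averages.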

Scaling to a large number of devices

I worry that most of you haven’t even made it this far, and those of you who did are probably saying “wut?” I can expand this to consider multiple devices sharing data, adding another simulation for each device and adding up the results. For example, I also have a data-only device that currently uses a b-mobile Fair SIM. I also have a good idea of how much data that device consumes, so I can sum the total data consumed between both devices to see if it would be cheaper to cancel the Fair plan from b-mobile and get a second Docomo SIM that shares my data quota. In this case, I can already imagine that the chances of the 2 GB plan being the cheapest would be reduced, but since the Fair plan is designed for little usage (on average 250 MB/month), it still could be in my best interest to get the 2 GB plan.
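As a rough illustration of adding up devices, the sketch below draws usage for each device separately and sums it before comparing against a shared quota. Both samplers are hypothetical placeholders: a triangular distribution for the phone and an exponential one with a 0.25 GB mean for the data-only device. Neither is fitted to my actual usage.

```python
import random

def phone_usage(rng):
    """Hypothetical phone usage in GB: between 0.1 and 7, peaking near 3.5."""
    return rng.triangular(0.1, 7.0, 3.5)

def tablet_usage(rng):
    """Hypothetical light data-only device: exponential, mean 0.25 GB."""
    return rng.expovariate(1 / 0.25)

def shared_usage(n_months, samplers, seed=0):
    """Sum one draw per device for each simulated month."""
    rng = random.Random(seed)
    return [sum(draw(rng) for draw in samplers) for _ in range(n_months)]

months = shared_usage(100_000, [phone_usage, tablet_usage])
for quota in (2, 5, 7):
    share = sum(m > quota for m in months) / len(months)
    print(f"combined usage exceeds {quota} GB in {share:.1%} of months")
```

The combined monthly totals can then be fed into the same cost rule as a single device, which is all that "adding another simulation for each device" amounts to here.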

If there is enough interest in seeing the results of this, I can make another post considering sharing between devices, family members, or small business employees.


  1. Boy o boy....You are using your golden week very well!!

    Impressive post. This is how I would have done it.

    Currently I contract ~6GB data for around 3000yen a month
    -> ~2500 yen for 3GB Docomo LTE data (2 year contract with discount included)
    -> ~600 yen for 3GB Emobile LTE data (2 year contract with discount included)

    Actual usage,
    -> ~ 700MB on Docomo

    -> ~ 200MB on Emobile

    But I guess that as of now, I do not have many options to get LTE high speed data at less than 500yen/GB.

  2. Yeah, for just a small amount of data, there really are no options for high speed since the MVNOs can't afford to pull out the speed bumps. Families with multiple phones, etc. will be the ones who will see the most benefit from this.

  3. Wow, someone's been bored during GW... Still.. an academically stimulating investigation.
    I would argue that the few thousand yen savings over 2 years are not worth the discomfort of having to worry about your data usage... Unless you have the moral rigidity to religiously set aside the ~tens of yen you save every month and eventually buy that tablet after 2 years... in which case... my hat is off to you, Sir.

  4. This is what happens on long flights with AC power on your seat and a core i7 notebook with lots of RAM :)

    If you read the next post, I think you'll see where I am going with this.

    I am considering multiple people sharing one of the large data plans. This could be, for example, a family of four or tens to hundreds of employees of a small business, where optimizing monthly bandwidth cost becomes very important. (The cost of the largest business plan is 1,900,000 yen per month, ten times the total 2-year cost I'm talking about here.)

    Instead of just jumping straight into that, I decided to introduce the framework in a simple way, considering the variance in just one device.