Saturday, March 24, 2012

Ads can steal your power - Mobile trade-offs

In reading the article "Study: Free Android Apps Can Steal Your Phone's Power" the other day, I was reminded of all the trade-offs one has to make when designing mobile applications. Before we dig into some of the trade-offs, one does have to wonder about the purpose of one company doing a sanctioned study of another company's products. We'll leave digging into that topic for another day.

Now back to the topic of mobile trade-offs. Some might argue this, but the single most important thing to design for is minimal power consumption. Power is so important on mobile platforms because users really don't want to hang out at charging pods in airports, plug in on the train, plug in at a friend's house, ... One of the big delighters of the original Kindle was that it could run for weeks on a single charge. The newer Kindle Fire lasts less than a day, like most current-gen platforms. The new iPad3 has a battery almost twice the size without any additional device life (~10 hours). So where is all that power going? The two biggest draws on power are the screen and the radio. You have some control over the power consumption of the screen: the dimmer you go, the longer you go. The iPad3 has a super dense screen and new graphics CPUs. I wonder how many people would keep the old screen and CPUs for 20 hours of use on an iPad2?

I have a Windows Phone 7 (yes I am willing to admit I have one - I'm a techie) and just use it as an in-house Wi-Fi device. What totally surprised me is that the phone lasts for weeks without the radio on. The first time that happened I was pretty surprised by just how much impact the radio has.

While I've talked about the screen and radio, we can't ignore use of the processor (CPU). Most mobile devices have ARM chips which do all kinds of cool things like clocking the CPU lower when not in use, having low-power cores for when the phone is idle, etc. The take-away is that doing really CPU-intensive work, like my fractal app at http://www.tonyware.com/fractals, will drain your power. It has drained my phone because zooming in is so cool :)

So what are some of the trade-offs for mobile you should be thinking about?
  • Do you really need to send data to your backend server every 10 seconds, or would once an hour be good enough? Not only will this save power, it will also cost your customers less of their network bandwidth. (Rule #1 - Use the radio less)
  • Does having a white background really make the App better? Can you choose dimmer colors? If the App is idle, should you dim the screen? (Rule #2 - Lighting up more pixels uses more power)
  • What about pushing more computation to the server rather than having the phone do it? (Rule #3 - Don't use the CPU for big calculations)
The last point, doing work on the server rather than the phone, is one of the reasons Google AppEngine is growing so quickly for mobile. The more you can do on the server, the more power will be saved on the phone. It's also very likely the computation will go much faster on the server. The trade-off on the developer end is how to manage cost while getting the best customer experience. By the way, computations in the Cloud are far "greener" than on any other computing device. But that is a topic for another day.
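To make Rule #1 a bit more concrete, here is a minimal sketch of batching in Python. It is only an illustration under my own assumptions: the class name `BatchedUploader`, the `send` callable, and the one-hour default are all hypothetical and not taken from any real mobile SDK. The idea is simply to queue events locally and wake the radio once per interval instead of once per event.

```python
import time


class BatchedUploader:
    """Queue events locally and upload them in one burst per interval.

    `send` is a hypothetical callable that uploads a list of events;
    `clock` is injectable so the behaviour can be checked without waiting.
    """

    def __init__(self, send, interval_seconds=3600.0, clock=time.monotonic):
        self._send = send
        self._interval = interval_seconds
        self._clock = clock
        self._pending = []
        self._last_flush = clock()

    def record(self, event):
        """Store an event; only touch the network once the interval has elapsed."""
        self._pending.append(event)
        if self._clock() - self._last_flush >= self._interval:
            self.flush()

    def flush(self):
        """Upload everything queued so far in a single request."""
        if self._pending:
            self._send(list(self._pending))
            self._pending.clear()
        self._last_flush = self._clock()
```

With this shape, an app that records an event every 10 seconds would still only make one upload per hour. A real app would probably also want to flush when it goes to the background so queued events are not lost.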

Hope this got you thinking....

  Enjoy,
     Anthony F. Voellm (aka Tony)


Friday, March 2, 2012

Old rules of thumb always need to be reconsidered

After being in the computer industry for a while you begin to appreciate just how much machine capabilities change and the need to change designs along with them. For example, just 10 years ago developers would spend hours trying to find ways to save a few bytes of memory. Now most of the code the world runs is via interpreters (JIT, Python, PHP, JScript, ...) and a few bytes are less interesting. I'm not saying to waste them, but I personally would not put it as my first priority.


Let me give you an example of how changes in machine capabilities caused the rethinking of an OS. In Windows XP, Microsoft engineers designed the memory manager to aggressively push data from main memory (RAM) to disk. This was done because RAM was costly and very small (~128MB) at the time, so if more memory could be freed up ahead of time new applications could start faster. If you waited until an application started to free memory, users would wait 30 seconds to minutes before the application was usable because of paging RAM to disk. Between the time Windows XP and Vista shipped, RAM prices dropped dramatically (from $40 for 128MB to just $2).




With the dramatic change in memory prices and the fact that disks did not really get any faster, Vista fundamentally broke from the past rule of thumb (free up as much RAM as possible by pushing data to disk) and did just the reverse. RAM was cheap and relatively plentiful, so a feature called SuperFetch was created to aggressively page in data from disk to RAM. Based on the decision to not force RAM to disk, overall UI performance seemed snappier in Vista. No more shaking the mouse after lunch with XP and waiting a minute+ before logging in.


Well, it looks like with the improved performance of CPUs and networks, old rules of thumb around UI responsiveness are starting to be reconsidered. Some early UI research in 1968 by Miller and in 1991 by Card led to rules of thumb for UI that are regularly cited in "Response Times: The 3 important limits" and were extended for the World Wide Wait, I mean Web.


Here is a recap of those rules and a few more that I have adopted from experience and very likely from papers I read long ago and have forgotten:

  • Users consider 100ms response times fast
  • At around 1 second users will notice a delay but are tolerant
  • At 5 seconds users are starting to get impatient and may take action
  • At 10 seconds they lose focus
  • At 15 seconds they are likely to hit “refresh”
  • At 30 seconds they generally navigate away and don’t come back if there is an acceptable alternative.
Well, it looks like even hard-earned rules of thumb for UI and Web are now falling, as seen in a recent NYTimes article, "For Impatient Web Users, an Eye Blink Is Just Too Long to Wait". Based on this article it looks like 250 milliseconds is the new goal for web responsiveness rather than the 1 second we had all used.
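If you want to apply the rules of thumb listed above to measured response times, a small lookup is enough. The following is just a sketch in Python: the function name and the label strings are mine, and the "acceptable" label for the gap between 100ms and 1 second is an assumption, since the list says nothing about that range.

```python
# Thresholds in seconds, taken from the rules of thumb above, largest first.
# The label wording is illustrative only.
LIMITS = [
    (30.0, "likely to navigate away"),
    (15.0, "likely to hit refresh"),
    (10.0, "losing focus"),
    (5.0, "getting impatient"),
    (1.0, "delay noticed but tolerated"),
]


def describe_response_time(seconds):
    """Map a measured response time to the highest rule-of-thumb limit it reached."""
    if seconds < 0:
        raise ValueError("response time cannot be negative")
    if seconds <= 0.1:
        return "fast"
    for limit, label in LIMITS:
        if seconds >= limit:
            return label
    # Between 100ms and 1 second: not covered by the list (assumed label).
    return "acceptable"
```

Keeping the thresholds in one table makes it easy to rethink them later, for example by adding a 250 millisecond entry.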

The overall moral of the story is don't hold on too dearly to those rules of thumb, and perhaps you should rethink them often.

  -- Anthony F. Voellm

Monday, February 27, 2012

Fix security bugs early - Interesting paper

Interesting paper - Find security bugs before they release because of the high cost to fix later. Internet Apps change some of the cost dynamics; however, that does not mean fixing early is less important, because it's hard to fix your reputation.

http://www.stickyminds.com/Files/Automated%20Testing%20With%20Commerical%20Fuzzing%20Tools.pdf



Monday, November 7, 2011

A look at the Fundamentals in the Cloud

If you are interested in the Cloud and testing the following is a talk I did at GTAC2011 that might be interesting to you.

Part the Clouds and See Fact from Fiction

Sunday, October 30, 2011

Old performance adage... Polling is bad

It's long been known that polling is bad. It uses a ton of resources. The challenge is that it trips up even great developers. Check out... http://m.guardian.co.uk/technology/2011/oct/29/iphone-4s-battery-location-services-bug?cat=technology&type=article

One way to catch this is to have a good set of resource monitoring tests. It's very likely Apple had these; however, it's hard to catch with so many ways to configure software. This is where collecting these same resource measurements from released devices can help (crowd sourcing test). Check out, for example, Microsoft's SQM (aka Customer Experience Improvement Program) data.

Should you decide to collect telemetry just remember the second adage... Bad collection is like polling.
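For anyone who has not seen the difference spelled out, here is a small Python sketch contrasting a polling loop with a blocking wait on `threading.Event`. The helper names are made up for this example, and counting wake-ups is only a rough stand-in for the real cost (CPU time and battery) of polling.

```python
import threading
import time


def wait_by_polling(flag, interval=0.01):
    """Check the flag over and over; returns how many times the loop woke up."""
    wakeups = 0
    while not flag.is_set():
        wakeups += 1
        time.sleep(interval)
    return wakeups


def wait_by_event(flag):
    """Block until the flag is set; the thread sleeps and wakes once."""
    flag.wait()
    return 1


def signal_after(flag, delay_seconds):
    """Set the flag from a background timer thread after a delay."""
    timer = threading.Timer(delay_seconds, flag.set)
    timer.start()
    return timer
```

Waiting 200ms with a 10ms polling interval wakes the thread many times to learn nothing new, while the event-based wait wakes once when there is actually something to do.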

Friday, October 28, 2011

Crowd sourcing Apple iPhone 4S power performance

Interesting, easy way to solve the issue...

http://m.guardian.co.uk/technology/2011/oct/28/iphone-4s-battery-apple-engineers?cat=technology&type=article

Monday, October 17, 2011

Performance Test Pattern

One of the biggest challenges in monitoring and tracking performance is getting stable and repeatable numbers.  Check out the following plot of two performance tests.  On which do you think it will be easier to spot regressions?


There are two tests here: test 1, which looks pretty erratic, and test 2, which looks pretty stable and repeatable around 3 ms. Given this I'm pretty sure you are going to choose test 2 on which to monitor and track performance. The test is very stable, meaning the variance is very low, and repeatable, because it does not drift off over time. For an example of drift check out the following:


Here you can see the results are pretty stable in that the overall variance from number to number is pretty low; however, the results are not very repeatable and seem to drift up over time. For this particular test a high ping time is bad. This is also an example of "death by a thousand cuts", where from test to test results look good but over long periods of time you see performance is dropping off.
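One way to put numbers on "stable" versus "repeatable" is to look at run-to-run noise and long-term slope separately. The Python sketch below is my own illustration, not the tooling behind the plots: the function names are hypothetical, and a least-squares slope is just one reasonable choice for measuring drift.

```python
from statistics import mean, pstdev


def drift_per_run(samples):
    """Least-squares slope of the results against their run index.

    A value near zero means the results are repeatable over time.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    x_mean = (n - 1) / 2
    y_mean = mean(samples)
    numerator = sum((i - x_mean) * (y - y_mean) for i, y in enumerate(samples))
    denominator = sum((i - x_mean) ** 2 for i in range(n))
    return numerator / denominator


def run_to_run_noise(samples):
    """Standard deviation of the change between consecutive runs.

    A low value means each result looks stable next to its neighbours.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    deltas = [b - a for a, b in zip(samples, samples[1:])]
    return pstdev(deltas)
```

A series that creeps up by 0.1 ms every run has almost no run-to-run noise but a clear positive slope, which is exactly the "death by a thousand cuts" shape.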

So the question then comes up how do you make stable and repeatable performance tests?  The answer is to follow a test pattern like the xUnit pattern with a couple of extra steps. The pattern is the following:

  1. setup 
  2. warmup
  3. execute
  4. something most tests forget
  5. publish
  6. cleanup
  7. teardown
Notice the following additional steps - warmup, step 4, publish, cleanup. Now let me explain these steps.

Warmup - This step is here to allow the performance test to "warm up" the system under test. For example, if you want to measure database queries you generally have to decide if you want hot numbers (most likely the common case), where the database has been in use for a while, or cold numbers, which is the state right after boot/init/etc. By having warmup you can run both hot and cold tests by the addition or removal of this step. An example might be selecting 10 rows from a database before doing the general select tests.

Step 4 - Ahh... the mystery. What is step 4? Take a quick look back at the first graph. Any ideas? Well the answer is VALIDATE. Most performance tests forget to validate the results they are getting. In the previous step of warmup we said to select 10 rows. Did the test actually return 10 rows? If not there is likely some error. Be sure to check your results and don't publish them if there was an error. Generally on performance graphs invalid results look like super high, 0, or super low numbers.

Publish - This is the act of pushing the result into your tracking infrastructure. Performance results tend to have a limited lifetime of usefulness; however, there is always good cause to look back over time.

Cleanup - Cleanup is like teardown without exiting all layers of initialization.  Generally the role of cleanup is to get things put back in order so the test can be run again with minimal side-effects.  For cold performance results you will need to teardown.
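Putting the steps together, the pattern might look roughly like the Python sketch below. Everything named here is hypothetical: `run_performance_test`, the test object's methods, and the `SelectRowsTest` stand-in (which slices a list instead of querying a real database) are just one way to lay it out.

```python
import time


def run_performance_test(test, publish, iterations=5):
    """Drive a test through setup, warmup, execute, validate, publish, cleanup, teardown."""
    published = []
    test.setup()                                     # 1. setup
    try:
        test.warmup()                                # 2. warmup (skip for cold numbers)
        for _ in range(iterations):
            start = time.perf_counter()
            result = test.execute()                  # 3. execute
            elapsed = time.perf_counter() - start
            try:
                if test.validate(result):            # 4. validate
                    publish(elapsed)                 # 5. publish only valid results
                    published.append(elapsed)
            finally:
                test.cleanup()                       # 6. cleanup so the next run is clean
    finally:
        test.teardown()                              # 7. teardown
    return published


class SelectRowsTest:
    """Toy stand-in for a database select test."""

    def setup(self):
        self.rows = list(range(100))

    def warmup(self):
        _ = self.rows[:10]

    def execute(self):
        return self.rows[:10]

    def validate(self, result):
        return len(result) == 10

    def cleanup(self):
        pass

    def teardown(self):
        self.rows = []
```

Calling `run_performance_test(SelectRowsTest(), results.append)` with a plain list named `results` would collect one timing per valid iteration. Timings from runs that fail validation never reach `publish`, so they cannot show up as bogus spikes or zeros on the graph.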

While execute is not a new step in the performance pattern, I wanted to mention it because oftentimes in performance tests you want "stable" numbers. This is generally achieved by running the execute step a number of times and averaging, or by repeating steps 3 - 6 a number of times. While averaging is often the right answer, it can sometimes hide performance issues. Perhaps I'll blog on that another day.
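As a quick taste of how an average can hide things, here is a small Python sketch that reports a few summary numbers side by side. The function name and the choice of statistics are my own assumptions, and the 95th percentile uses a simple nearest-rank rule.

```python
from statistics import mean, median


def summarize(samples_ms):
    """Return mean, median, nearest-rank 95th percentile, and max of the samples."""
    if not samples_ms:
        raise ValueError("need at least one sample")
    ordered = sorted(samples_ms)
    n = len(ordered)
    # Nearest-rank percentile: ceil(0.95 * n) computed with integers, then 0-based.
    p95_index = (95 * n + 99) // 100 - 1
    return {
        "mean": mean(ordered),
        "median": median(ordered),
        "p95": ordered[p95_index],
        "max": ordered[-1],
    }
```

For nineteen runs at 3 ms and one run at 60 ms, the mean comes out at 5.85 ms, a number no single run actually produced, while the median stays at 3 ms and the max shows the 60 ms outlier that the average smeared away.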

Now that you have a solid performance test pattern go forth and create amazing results....

  Tony