Friday, October 28, 2011

Crowdsourcing Apple iPhone 4S power performance

An interesting and easy way to tackle the issue...

http://m.guardian.co.uk/technology/2011/oct/28/iphone-4s-battery-apple-engineers?cat=technology&type=article

Monday, October 17, 2011

Performance Test Pattern

One of the biggest challenges in monitoring and tracking performance is getting stable and repeatable numbers.  Check out the following plot of two performance tests.  On which do you think it will be easier to spot regressions?


There are two tests here: test 1, which looks pretty erratic, and test 2, which looks pretty stable and repeatable at around 3 ms.  Given this I'm pretty sure you are going to choose test 2 to monitor and track performance.  The test is very stable, meaning the variance is very low, and repeatable, meaning it does not drift off over time.  For an example of drift check out the following:


Here you can see the results are pretty stable in that the overall variance from number to number is pretty low; however, the results are not very repeatable and seem to drift up over time.  For this particular test high ping time is bad.  This is also an example of "death by a thousand cuts" where from test to test results look good but over long periods of time you see performance is dropping off.
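One rough way to catch drift like this automatically (a sketch; the function name and sample numbers are mine, not from any real test suite) is to fit a least-squares line to the result series and alert when the slope moves away from zero:

```python
def slope(values):
    """Least-squares slope of values vs. their index - the per-test drift."""
    n = len(values)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var

stable = [3.1, 2.9, 3.0, 3.1, 3.0, 2.9]   # hovers around 3 ms
drifting = [10, 11, 12, 13, 14, 15]       # ping times creeping up

print(round(slope(stable), 2))    # near zero: repeatable
print(round(slope(drifting), 2))  # clearly positive: drift
```

A slope threshold is crude but it turns "eyeball the graph" into something a test harness can flag on its own.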

So the question then comes up: how do you make stable and repeatable performance tests?  The answer is to follow a test pattern like the xUnit pattern with a couple of extra steps.  The pattern is the following:

  1. setup 
  2. warmup
  3. execute
  4. something most tests forget
  5. publish
  6. cleanup
  7. teardown
Notice the additional steps - warmup, step 4, publish, cleanup.  Now let me explain these steps.

Warmup - This step is here to allow the performance test to "warm up" the system under test.  For example, if you want to measure database queries you generally have to decide if you want hot numbers (most likely the common case), where the database has been in use for a while, or cold numbers, which reflect the state right after boot/init/etc.  By having a warmup step you can test both hot and cold by the addition or removal of this step.  An example might be selecting 10 rows from a database before doing the general select tests.

Step 4 - Ahh... the mystery.  What is step 4?  Take a quick look back at the first graph.  Any ideas?  Well the answer is VALIDATE.  Most performance tests forget to validate the results they are getting.  In the previous step of warmup we said to select 10 rows.  Did the test actually return 10 rows?  If not there is likely some error.  Be sure to check your results and don't publish them if there was an error.  On performance graphs invalid results generally show up as super high, zero, or super low numbers.

Publish - This is the act of pushing the result into your tracking infrastructure.  Performance results tend to have a limited lifetime of usefulness; however, there is always good cause to look back over time.

Cleanup - Cleanup is like teardown without exiting all layers of initialization.  Generally the role of cleanup is to put things back in order so the test can be run again with minimal side effects.  For cold performance results you will need a full teardown.

While execute is not a new step in the performance pattern, I wanted to mention it because often in performance tests you want "stable" numbers.  This is generally achieved by running the execute step a number of times and averaging, or by repeating steps 3 - 6 a number of times.  While averaging is often the right answer it can sometimes hide performance issues.  Perhaps I'll blog on that another day.
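The whole pattern can be sketched as a small harness.  This is only an illustration - the function names and the stand-in steps are mine - but it shows where each step sits and why validate guards publish:

```python
import statistics
import time

def run_perf_test(setup, warmup, execute, validate, publish, cleanup,
                  teardown, iterations=5):
    """Sketch of the pattern: setup, warmup, then repeat
    execute -> validate -> publish -> cleanup, and finally teardown."""
    setup()
    warmup()                      # drop this call to get "cold" numbers
    timings_ms = []
    try:
        for _ in range(iterations):
            start = time.perf_counter()
            result = execute()
            elapsed_ms = (time.perf_counter() - start) * 1000
            if not validate(result):   # step 4: never publish a bad run
                continue
            publish(elapsed_ms)
            timings_ms.append(elapsed_ms)
            cleanup()
    finally:
        teardown()
    return statistics.median(timings_ms)  # median is skewed less by outliers

# Usage with stand-in steps:
noop = lambda: None
median_ms = run_perf_test(
    setup=noop, warmup=noop,
    execute=lambda: sum(range(10000)),
    validate=lambda r: r == 49995000,    # did the work actually happen?
    publish=lambda ms: None, cleanup=noop, teardown=noop)
```

Reporting the median rather than the mean is one simple hedge against averaging hiding the occasional bad run.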

Now that you have a solid performance test pattern go forth and create amazing results....

  Tony





Friday, September 9, 2011

Remember WPR? Check out the drive for better power use at Google

A couple of posts ago I talked about Watts Per Request (WPR) and how power is becoming ever more important: http://perfguy.blogspot.com/2011/05/single-most-import-performance-metric.html.  What is cool is Google just released its power consumption numbers to the world, and they give some good insights.  In Google fashion everything was accounted for, right down to the Google Street View cars.  Check out the NY Times article http://www.nytimes.com/2011/09/09/technology/google-details-electricity-output-of-its-data-centers.html.  To learn more about Google's power use and the industry standard metric PUE (Power Usage Effectiveness) check out http://www.google.com/about/datacenters/index.html.

Enjoy,
   Tony

Monday, July 11, 2011

Ahh... the world has evolved. No more 1TB sorts.

The 1TB sort competition has ended because winners take less than a second now. Good news is there are other competitions... http://sortbenchmark.org/

Thursday, June 30, 2011

The second most important metric - Location, Location, Location

You might think from the title "Location, location, location" this will be a post about real estate; however, in the new era of Cloud and Mobile Computing, location is going to be a huge factor both in the design of your service and in testing it, performance in particular.  Cloud Computing is a growing trend which is enabling all kinds of new applications.  If you want to find out more on Cloud Computing you can check out the slides from a talk I did at the Better Software Conference this year in Las Vegas here.

So why does location matter?  Location matters because of physics, and no one has figured out how to outrun the speed of light.  Ok... that is a little abstract so let me give an example.  Imagine you live in South Africa and want to deploy your cool new service in California because lots of Cloud providers have data centers there - super fast to do and cheap.  A single packet of data will take about half a second to go from South Africa to California and back.  That's roughly how long it takes light to make the round trip.  Now imagine your new site serves pages with images to fetch, database rows to read, etc.  Each new object you serve means the user (using a browser) in South Africa has to request data from California.  Each round trip is 0.5 seconds, so a page with 10 images could take 5 to 10 seconds just trying to initiate the fetches.  Wow... it's not looking good for your service if you want people in South Africa to use it.
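The arithmetic behind that estimate looks like this (the page size and browser parallelism numbers are my own illustrative picks):

```python
rtt_seconds = 0.5        # rough South Africa <-> California round trip
objects_on_page = 10     # images and other resources to fetch

# Worst case: every fetch is initiated one after another.
serial_seconds = objects_on_page * rtt_seconds

# Better case: the browser opens a couple of connections in parallel.
parallel_connections = 2
parallel_seconds = objects_on_page / parallel_connections * rtt_seconds

print(serial_seconds, parallel_seconds)   # 5.0 2.5
```

Even with some parallelism, the page is still seconds away - and no amount of server tuning in California can buy back that round-trip time.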

Some generally accepted time constraints for operations are the following -
  • User interfaces should respond in around 100 millisec or less.  Human perception is around 30 millisec.
  • A user can detect a software hang in around one second and will take action at around 5 seconds to fix it.
  • The good news for the web is users are willing to wait up to 15 seconds for a page; however, they will likely never come back if it takes more than 30 seconds.
So now that you have seen deploying your new service in California for your South African users is not such a good idea, what should you do?  The answer is to find zones closer to you like Europe or possibly Asia.  At the time of writing I don't know of anyone providing Cloud resources directly in Africa; however, the landscape is changing quickly as demand rises.

Another example of how location matters is interactive games.  Imagine a multi-player game really popular on the East Coast of the US with all the game servers on the West Coast of the US.  In general a packet takes around 50 to 70 millisec to travel there and back.  This means a game can only get around 10 to 15 corrections a second.  These long latencies can show up as you shooting another player first but for some reason you die.  Gamers hate this.

Given the growing dependence of cool applications on Cloud resources, it's time to really start thinking about where your users are and where your services are located.  The shorter the physical distance the better.

Location also matters for legal and privacy issues; however, that's a whole topic unto itself.

Sunday, May 8, 2011

The single most important performance metric - WPR

Before we dive into WPR I'd like to take a moment to write about metrics because without metrics there is nothing to measure or tune.  Metrics are the quantities you are going to measure on software and hardware.  More formally a metric is a unit of measure.  There are tons of interesting metrics like %CPU for CPU utilization, Packets Per Second, QPS (Queries per second), FPS (frames per second), RTT (round trip time), and so on.

In general metrics fall into two classes - Utilization and Throughput/Latency.  Utilization is a measure of how much something is used, so from the previous examples %CPU is a utilization metric.  Throughput metrics are a measure of the rate at which things get done, like QPS.  Latency is how long it takes for an individual piece of work to complete, like RTT.  As another example of throughput vs. latency: Google might do millions of queries a second (throughput), but you the end user are concerned with how fast your one query runs (latency).  Server software tends to tune for throughput while interactive software like mobile phone apps tunes for latency.
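A small illustration of the two classes, with made-up timings: the same run gives you one throughput number and several latency numbers, and they tell different stories.

```python
# made-up per-query latencies in milliseconds; note the one slow outlier
latencies_ms = [12, 15, 11, 90, 13, 14, 12, 11, 13, 12]

total_seconds = sum(latencies_ms) / 1000.0
throughput_qps = len(latencies_ms) / total_seconds       # what the server tunes for
mean_latency_ms = sum(latencies_ms) / len(latencies_ms)  # hides the outlier
worst_latency_ms = max(latencies_ms)                     # what the unlucky user felt

print(round(throughput_qps), mean_latency_ms, worst_latency_ms)  # 49 20.3 90
```

The throughput looks healthy and the mean looks fine, yet one user waited 90 ms - which is why interactive software watches latency distributions, not just rates.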

Another term you might have heard is efficiency.  Efficiency is a measure of wasted work.  The more efficient something is, the less work is wasted (i.e. driving around the block twice before parking is likely wasted work).  I don't list it in the metric classes above because both utilization and throughput/latency metrics can be used to derive efficiency.

In my experience throughput/latency metrics are more reliable than utilization metrics.  There are lots of reasons for this; for example, virtual machines and advanced CPUs tend to skew utilization but not throughput.  You can see a past post of mine that talks about skew on virtual machines here.  If there is interest I can write more on this topic.

Now back to WPR... WPR is Watts Per Request.  A watt is a measure of power used.  You might have seen references to power performance or power utilization over the last couple of years, but why does it matter so much?  Power utilization is so important these days because of portable devices and data centers.

Ten years ago most computing was under the desk, and prior to that it was in a central room.  Power in the central room was interesting; however, important issues like the speed of computation drove engineering.  Under the desk, the costs of inefficient computations (high WPR) were so spread out most people did not notice or care.  However, we all have really fast processors now (and yes they can be faster) and a computer in your pocket[book].  In your pocket[book] watts = surfing/talking/dorking time, and in the data center watts = heat.  Heat means you have to pay a lot for space and cooling.  The biggest cost for a data center is not the computers but rather the space and power used for cooling.

Now in simpler concepts as to why WPR is so important - lower WPR can make your phone/tablet/laptop last longer and save you money in the datacenter.

So now you might be wondering how to measure power. On Windows machines you can use "c:\windows\system32\powercfg -energy" and on Linux machines you can read sensors (lm-sensors).  The internal computer sensors can be useful however most engineers looking to drive WPR are using external measurement tools like Extech or Intech which are more accurate and have computer readouts which can be used for automation.
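Once you have a watts reading, the WPR arithmetic itself is simple.  Note that watts divided by requests per second reduces to joules per request (the numbers below are illustrative, not from a real measurement):

```python
def watts_per_request(avg_watts, requests, window_seconds):
    """WPR = average power / request rate (equivalently, joules per request)."""
    requests_per_second = requests / window_seconds
    return avg_watts / requests_per_second

# A server drawing 200 W on the external meter while serving
# 5000 requests over a 10 second measurement window:
print(watts_per_request(200, 5000, 10))   # 0.4 (joules per request)
```

Tracking that one number over builds is enough to catch a change that made the service hungrier per unit of work, even if throughput stayed flat.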

Little did you know that Qsort [O(n*log(n))] vs. BubbleSort [O(n^2)] was making happier users, saving money, and making the world a little greener.  Happy power hunting.
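To make that sorting point concrete, here is a quick experiment you can run yourself (bubble sort written out by hand; Python's built-in sorted stands in for a tuned O(n*log(n)) sort):

```python
import random
import timeit

def bubble_sort(values):
    """Classic O(n^2) bubble sort, for comparison only."""
    a = list(values)
    for i in range(len(a)):
        for j in range(len(a) - 1 - i):
            if a[j] > a[j + 1]:
                a[j], a[j + 1] = a[j + 1], a[j]
    return a

data = [random.random() for _ in range(2000)]
t_bubble = timeit.timeit(lambda: bubble_sort(data), number=1)
t_fast = timeit.timeit(lambda: sorted(data), number=1)

# Same answer, wildly different CPU time - and CPU time is watts.
assert bubble_sort(data) == sorted(data)
print(t_bubble > t_fast)   # True
```

Every extra second of CPU is energy drawn from a battery or turned into data-center heat, so the algorithmic win is also a WPR win.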

  Tony Voellm

Sunday, May 1, 2011

Three steps to making great performant software

Over the last 20+ years I have been teaching and learning about performance, and it's time to return more of what I've learned to the public domain.  My knowledge is based in OS (I was a Windows and IRIX kernel developer as well as the Hyper-V perf lead), DB (I led the SQL Server perf team), web apps, compilers, image processing, optimization, and much more.  I've worked at the best companies like SGI, MSFT, and now Google, which has also given me a wider perspective.

So now that you have a little of my background I'm going to teach you the three steps to making performant software.  You Ready?

Step 1: Have a plan
Step 2: Instrument
Step 3: Measure and Track

Yep... that's it.  Now to put this into perspective, the diet industry also has three steps:

Step 1: Eat less
Step 2: Exercise more
Step 3: Keep doing 1 and 2

However simple those three steps are, there is a multi-billion dollar business out there to teach them to us.  The steps are not easy and there are a lot of nuances like "What should I eat less of?", etc.  The three steps to great performance are a lot like the diet steps.  There are a lot of nuances and in coming posts I'll detail them more.  For now I'll give you a quick rundown.

Step 1 is to have a plan.  This means you have an idea of why you are trying to improve the software and how you want to improve it.  You have some goal in mind.  If you have no goal then why are you performance tuning?

Step 2 is to instrument.  This means you will be putting markers into the code you are measuring in such a way that you can figure out how close you are to your goal.  There are lots of ways to instrument, with the simplest being a printf; others include performance counters, Windows ETW, etc.
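As a minimal sketch of the printf end of that spectrum (the decorator name is mine), here is one way to drop a timing marker around any function:

```python
import functools
import time

def instrument(func):
    """Wrap a function and emit a timing marker every time it runs."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            elapsed_ms = (time.perf_counter() - start) * 1000
            print(f"{func.__name__}: {elapsed_ms:.3f} ms")  # the "printf" marker
    return wrapper

@instrument
def work():
    return sum(range(100000))

result = work()   # prints something like "work: 2.134 ms"
```

In real code the print would feed a counter or a trace system like ETW, but the idea is the same: the marker tells you how close each change gets you to your goal.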

Step 3 is to measure and track.  This means with each change you make you'll measure the impact it has on your performance goals and track it over time.  If a regression shows up you'll be ready to fix it.

I can't wait to dig into the three steps more with you...

  Tony Voellm