Monday, November 7, 2011
Sunday, October 30, 2011
It's long been known that polling is bad: it uses a ton of resources. The challenge is that it trips up even great developers. Check out... http://m.guardian.co.uk/technology/2011/oct/29/iphone-4s-battery-location-services-bug?cat=technology&type=article
One way to catch this is to have a good set of resource monitoring tests. It's very likely Apple had these; however, a bug like this is hard to catch when there are so many ways to configure the software. This is where collecting the same resource data from released devices can help (crowdsourced testing). Check out, for example, Microsoft's SQM (aka Customer Experience Improvement Program) data.
Should you decide to collect telemetry, just remember the second adage... Bad collection is like polling.
Friday, October 28, 2011
Monday, October 17, 2011
There are two tests here: test 1, which looks pretty erratic, and test 2, which looks pretty stable and repeatable at around 3 ms. Given this, I'm pretty sure you are going to choose test 2 to monitor and track performance. The test is stable, meaning the variance is very low, and repeatable, meaning it does not drift off over time. For an example of drift check out the following:
Here you can see the results are pretty stable, in that the variance from one number to the next is pretty low; however, the results are not very repeatable and seem to drift up over time. For this particular test a high ping time is bad. This is also an example of "death by a thousand cuts", where from test to test the results look good but over long periods of time you see performance dropping off.
So the question then comes up: how do you make stable and repeatable performance tests? The answer is to follow a test pattern like the xUnit pattern with a couple of extra steps. The pattern is the following:
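To make the difference between "stable" and "repeatable" concrete, here is a minimal sketch that puts a number on each. It is my own illustration, not the tooling behind the graphs: the function name `spread_and_drift` and the sample values are made up. Spread is measured as the coefficient of variation and drift as the least-squares slope of the result against the run number.

```python
from statistics import mean, pstdev

def spread_and_drift(samples):
    """Return (coefficient of variation, least-squares slope per run)."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    avg = mean(samples)
    # Spread: standard deviation relative to the mean (low = stable).
    cv = pstdev(samples) / avg
    # Drift: slope of the best-fit line through (run index, result).
    x_avg = (n - 1) / 2
    num = sum((i - x_avg) * (y - avg) for i, y in enumerate(samples))
    den = sum((i - x_avg) ** 2 for i in range(n))
    return cv, num / den

# Hypothetical ping times in ms, one per test run.
stable = [3.0, 3.1, 2.9, 3.0, 3.1, 2.9, 3.0, 3.0]
drifting = [3.0, 3.1, 3.2, 3.3, 3.4, 3.5, 3.6, 3.7]

for name, runs in (("stable", stable), ("drifting", drifting)):
    cv, slope = spread_and_drift(runs)
    print(f"{name}: spread={cv:.3f} drift={slope:+.3f} ms/run")
```

A slope that stays near zero run after run suggests the test is repeatable. A small but steadily positive slope is the "thousand cuts" pattern, even when each individual result looks fine.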
- something most tests forget
Friday, September 9, 2011
Monday, July 11, 2011
Thursday, June 30, 2011
- User interfaces should respond in around 100 milliseconds or less. Human perception is around 30 milliseconds.
- A user can detect a software hang in around one second and will take action to fix it at around 5 seconds.
- The good news for the web is that users are willing to wait up to 15 seconds for a page; however, they will likely never come back if it takes more than 30 seconds.
Sunday, May 8, 2011
In general metrics are thought of in two classes - Utilization and Throughput/Latency. Utilization is a measure of how much something is used, so from the previous example %CPU is a utilization metric. Throughput metrics measure the rate at which things are getting done, like QPS (queries per second). Latency is how long it takes for an individual piece of work to complete, like RTT (round-trip time). Another example of Throughput/Latency: while Google might do millions of queries a second (throughput), you the end user are concerned with how fast your query runs (latency). Server software tends to tune for throughput while interactive software like mobile phone apps tunes for latency.
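Here is a small sketch of how the two can be computed from the same data. It assumes you have logged a start and end time for each request; the function name and the sample numbers are hypothetical.

```python
def throughput_and_latency(requests):
    """requests: list of (start_seconds, end_seconds) pairs."""
    if not requests:
        raise ValueError("no requests")
    latencies = [end - start for start, end in requests]
    # Wall-clock window from the first start to the last finish.
    window = max(end for _, end in requests) - min(start for start, _ in requests)
    if window <= 0:
        raise ValueError("requests must span a positive time window")
    throughput = len(requests) / window           # requests per second
    avg_latency = sum(latencies) / len(latencies)  # seconds per request
    return throughput, avg_latency

# One worker handling requests back to back.
serial = [(0.0, 0.5), (0.5, 1.0)]
# Two workers handling the same kind of request in parallel.
parallel = [(0.0, 0.5), (0.0, 0.5), (0.5, 1.0), (0.5, 1.0)]

print(throughput_and_latency(serial))
print(throughput_and_latency(parallel))
```

In this made-up data the second worker doubles the throughput while each request still takes half a second, which is why a server can look great on throughput and still feel slow to an individual user.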
In my experience throughput/latency measures are more reliable than utilization metrics. There are lots of reasons for this; for example, virtual machines and advanced CPUs tend to skew utilization but not throughput. You can see a past post of mine that talks about skew on virtual machines here. If there is interest I can write more on this topic.
Now back to WPR... WPR is Watts per request. Watts is a measure of the power used. You might have seen references to Power Performance or Power Utilization etc over the last couple of years but why does it matter so much? Power utilization is so important these days because of portable devices and data centers.
Ten years ago most computing was under the desk, and prior to that it was in a central room. Power in the central room was interesting; however, more pressing issues like the speed of computation drove engineering. Under the desk the costs of inefficient computation (high WPR) were so spread out that most people did not notice or care. However, we all have really fast processors now (and yes, they can be faster) and a computer in our pocket[book]. In your pocket[book] watts = surfing/talking/dorking time, and in the data center watts = heat. Heat means you have to pay a lot for space and cooling. The biggest cost for a data center is not the computers but rather the space and the power used for cooling.
Put more simply, here is why WPR is so important: lower WPR can make your phone/tablet/laptop last longer and save you money in the data center.
So now you might be wondering how to measure power. On Windows machines you can use "c:\windows\system32\powercfg -energy" and on Linux machines you can read sensors (lm-sensors). The internal computer sensors can be useful; however, most engineers looking to drive WPR down use external measurement tools like Extech or Intech, which are more accurate and have computer readouts that can be used for automation.
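Once you have power readings, the arithmetic is simple. The sketch below is one reasonable reading of WPR, not an official formula: average power multiplied by the measurement window gives energy in joules, and dividing by the requests completed gives energy per request, which is the same number as average watts divided by requests per second. The function name and the readings are made up.

```python
def energy_per_request(power_samples_watts, duration_seconds, requests_completed):
    """Average energy (joules) spent per request over a measurement window."""
    if not power_samples_watts or requests_completed <= 0:
        raise ValueError("need power samples and at least one request")
    avg_watts = sum(power_samples_watts) / len(power_samples_watts)
    joules = avg_watts * duration_seconds  # watts x seconds = joules
    return joules / requests_completed

# Hypothetical meter readings taken during a 10 second run of 500 requests.
print(energy_per_request([100.0, 110.0, 90.0], 10, 500))  # 2.0
```

Track this number from build to build the same way you would track latency. Subtracting the machine's idle power first is a common refinement, left out here to keep the sketch short.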
Little did you know that Qsort [O(n*log(n)) on average] vs. BubbleSort [O(n^2)] was making happier users, saving money, and making the world a little greener. Happy power hunting.
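To see how big the gap in work is, here is a rough sketch that counts comparisons. Note the swap: it uses Python's built-in sort (Timsort, also O(n*log(n))) rather than Qsort, and it treats comparison count as a stand-in for work done, which is only a loose proxy for power.

```python
import random
from functools import cmp_to_key

def bubble_sort_comparisons(items):
    """Plain bubble sort (no early exit); returns (sorted list, comparisons)."""
    data = list(items)
    comparisons = 0
    for end in range(len(data) - 1, 0, -1):
        for i in range(end):
            comparisons += 1
            if data[i] > data[i + 1]:
                data[i], data[i + 1] = data[i + 1], data[i]
    return data, comparisons

def builtin_sort_comparisons(items):
    """Built-in sort with a counting comparator; returns (sorted list, comparisons)."""
    comparisons = 0

    def compare(a, b):
        nonlocal comparisons
        comparisons += 1
        return (a > b) - (a < b)

    return sorted(items, key=cmp_to_key(compare)), comparisons

# 200 distinct pseudo-random values; the seed only makes runs repeatable.
values = random.Random(0).sample(range(10_000), 200)
bubble_sorted, bubble_count = bubble_sort_comparisons(values)
builtin_sorted, builtin_count = builtin_sort_comparisons(values)
print(bubble_count, builtin_count)
```

For 200 items the bubble sort always makes 200*199/2 = 19900 comparisons, while the built-in sort needs far fewer. Less work per request is where the lower WPR comes from.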
Sunday, May 1, 2011
So now that you have a little of my background I'm going to teach you the three steps to making performant software. You Ready?
Step 1: Have a plan
Step 2: Instrument
Step 3: Measure and Track
Yep... that's it. Now to put this into perspective, the diet industry also has three steps:
Step 1: Eat less
Step 2: Exercise more
Step 3: Keep doing 1 and 2
However simple those three steps are, there is a multi-billion dollar business out there to teach them to us. The steps are not easy and there are a lot of nuances like "What should I eat less of?", etc. The three steps to great performance are a lot like the diet steps. There are a lot of nuances and in coming posts I'll detail them more. For now I'll give you a quick rundown.
Step 1 is to have a plan. This means you have an idea of why you are trying to improve the software and how you want to improve it. You have some goal in mind. If you have no goal then why are you performance tuning?
Step 2 is to instrument. This means you will be putting markers into the code you are measuring in such a way that you can figure out how close you are to your goal. There are lots of ways to instrument, from a simple printf to performance counters, Windows ETW, etc.
Step 3 is to measure and track. This means that with each change you make you'll measure the impact it has on your performance goals and track it over time. If a regression shows up you'll be ready to fix it.
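Here is a minimal sketch of the three steps in one place, using nothing but the standard library. The names `marker`, `timings`, and `meets_goal`, the "build_index" label, and the 60 second goal are all made up for illustration. A real project would more likely use performance counters, ETW, or a tracing library.

```python
import time
from contextlib import contextmanager

# Step 3 storage: every measurement, keyed by marker name.
timings = {}

@contextmanager
def marker(name):
    """Step 2: wrap a block of code and record how long it took."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings.setdefault(name, []).append(time.perf_counter() - start)

def meets_goal(name, goal_seconds):
    """Step 1: the plan, expressed as a number every sample must stay under."""
    return max(timings[name]) <= goal_seconds

with marker("build_index"):
    sum(i * i for i in range(10_000))  # stand-in for the real work

print(timings["build_index"], meets_goal("build_index", 60.0))
```

Run the same measurement after every change and keep the history. A sample that starts failing `meets_goal` is your regression showing up.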
I can't wait to dig into the three steps more with you...
Monday, April 25, 2011
Why is it then that email clients like Outlook don't accept both ";" and "," to separate names? To Gmail's credit, while it prefers "," it accepts ";" without issue. It seems like such a simple usability improvement for Outlook and Windows Phone 7. If anyone knows the history behind the choice of ";" vs. "," and why or why not to accept both, I'd love to hear it.
As for "/" and "\" for directory naming, we could go on for ages on this topic. Fortunately most users never have to worry about this now because directory access is abstracted via GUIs.
This post was really intended to just help you think about the little things... they really matter!
Wednesday, March 2, 2011
Before I go too far I should let you know I am definitely biased toward Hyper-V after being the Performance Lead for three releases. You can check out my Hyper-V-only blog at http://blogs.msdn.com/tvoellm. However, now that I am no longer with Microsoft I'll do my best to give some good, balanced insights.
While ESX has been around longer than Hyper-V, I don't think you should use this to determine how fast or functional one is compared to the other. For example, is MySpace better than Facebook? What you will likely see from a more mature product is higher reliability, just because the engineers for the mature product have had more time to shake out bugs. I can't really speak to ESX reliability; however, I can say Hyper-V runs on everything from 1 processor up to 64 processors and we literally tested it day and night for thousands of hours of fault-free uptime. I think it is more than ready for your mission critical applications.
Now back to performance. First you need to understand there are a couple of types of virtualization. You can get all the details on http://en.wikipedia.org/wiki/Virtualization; however, for purposes of this article you just need to understand that Hyper-V and ESX virtualize the CPU, network, storage, and graphics using hardware support, emulation, and binary patching. Hardware support means hardware vendors like Intel/AMD have added capabilities to their hardware that allow certain operations, like page resolution or packet routing to a Virtual Machine (VM), to be done in hardware (hardware helps because it does not require expensive switches into the hypervisor). Emulation means the CPU/etc instructions to be executed are emulated in software rather than running on real hardware (you might wonder how this can be faster - it can be when there are a lot of events in the hardware that keep engaging the hypervisor). Last there is binary patching, where the original software being run in a VM is changed in some way - only ESX does this (binary patching is a useful technique because it allows more direct control of what a piece of software - the virtualized operating system in this case - should do, decided ahead of time rather than when an event happens).
Now that you have the basics of virtualization you can begin to understand some of the questions to ask, and you might also realize that there is a lot more than the Hyper-V and ESX bits that determines how performant your virtualized workloads will be.
Really, there is more than just Hyper-V and ESX to worry about? Yes. For example, Hyper-V and ESX both support special CPU features like second-level address translation; however, that does not mean the CPU you have supports them. CPUs that are three years old likely do not support second-level address translation, which is a key feature for making VMs run fast, especially VMs running memory intensive operations. So the first question you should ask when looking to virtualize is "What machine should I buy and in particular what CPU does it have?". The simple answer is to look for an Intel Nehalem based processor such as the Xeon 5500+ or Core i7+, and for AMD look for recent Opteron or Phenom II processors. You can see more on Intel virtualization here and AMD virtualization here. I can't stress enough how much of an impact the CPU virtualization features have. Both Hyper-V and ESX make good use of the CPU features and neither really has an advantage over the other. Choose your CPU with the workload you want to run in mind.
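If you have a candidate machine you can boot into Linux, one quick way to check is to look at the CPU flags the kernel reports. The sketch below assumes an x86 Linux system, where /proc/cpuinfo lists "vmx" (Intel VT-x) or "svm" (AMD-V) for hardware virtualization and "ept" (Intel) or "npt" (AMD) for second-level address translation. The function name and the sample text are made up, and flags can be hidden when you run the check inside a VM.

```python
def virtualization_flags(cpuinfo_text):
    """Report virtualization support from /proc/cpuinfo-style text."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            flags.update(value.split())
    return {
        # vmx = Intel VT-x, svm = AMD-V
        "hardware_virtualization": bool(flags & {"vmx", "svm"}),
        # ept = Intel Extended Page Tables, npt = AMD Nested Page Tables
        "second_level_address_translation": bool(flags & {"ept", "npt"}),
    }

# Abbreviated, hypothetical sample; a real flags line is much longer.
sample = "model name\t: Example CPU\nflags\t\t: fpu vme sse2 vmx ept\n"
print(virtualization_flags(sample))
```

On a real machine you would pass in the contents of /proc/cpuinfo, for example `virtualization_flags(open("/proc/cpuinfo").read())`.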
This leads us to question #2 - What do the workloads look like that you are planning to run? The reason the workload matters so much is how Hyper-V and ESX virtualize networking and storage. For example, if you have a fully cached web server you want to run in a virtual machine, it's very likely that Hyper-V will run better because its networking virtualization is better, although ESX is catching up. If however you are running a database, it may be that ESX will be better because it has more support from big hardware vendors like EMC and NetApp to improve storage performance. As for 3D graphics I don't have a clear winner for you. If you are mostly a Microsoft shop you should go with Hyper-V because of the deep integration with Terminal Services. Your workload is largely characterized by how it uses the network, storage, and graphics. The more it uses a resource, the more intensive the workload is in that dimension. For example, databases are storage intensive, web servers are generally network intensive, and simulations like weather modeling are CPU and graphics intensive.
So on to question #3 - What is your storage environment? The environment includes not only the host machine where the virtual machines will be running but also the storage infrastructure. For example, will you be running a SAN or an iSCSI storage network? If you want iSCSI to a VM then Hyper-V will likely be better because its networking performance is better overall; however, if you run a SAN then ESX might be the better choice. There are also other questions to ask around storage, like LUN provisioning, snapshotting, and migration (moving storage between host machines). The deeper the level of integration of the solution, the more performant it is likely to be. Hyper-V has great basic I/O performance; however, I've seen more integration of VMware with storage solutions.
Given we touched on storage we need to cover the importance of networking. It is worth asking: what virtualization networking features does your NIC support? Believe it or not, Intel and Broadcom have both adopted certain features like VMQ (aka NetQueue), TCP offload (checksum and large send), Jumbo Frames, and RDMA. Hyper-V has traditionally been ahead here but there has been some leapfrogging.
Another important performance dimension is power, so the question is: what power management features does your virtualization solution support? This is important because lots of power use means lots of heat, and lots of heat means lots of cost for cooling. Both Hyper-V and ESX have power management features; however, at the time of this article VMware is a bit ahead on this front. Overall, virtualization with either solution will save power because of true hardware to virtual machine consolidation. What I am talking about is which solution, once you have virtualized, will use less power per VM operation. They are both very competitive here.
Last but not least, what virtualization features do Hyper-V and VMware support? For Hyper-V you can see here and for ESX see here.
So to recap the questions:
#1 - What machine should I buy and in particular what CPU does it have? Intel here and AMD here
#2 - What do the workloads look like that you are planning to run?
#3 - What is your storage environment? Check out the NetApp and EMC sites.
#4 - What virtualization networking features does your NIC support?
#5 - What power management features does your virtualization solution support?
There are many, many more questions you could ask; however, the real purpose of this article was to help you understand that "Which is faster, Windows Hyper-V or VMware ESX?" is not such an easy question to answer, and to arm you with some questions you should ask.
In the end my recommendation is to try before you buy. You can ask all the questions you want; however, in the end you need to make a decision. My suggestion is to borrow an environment if you can and try your workloads on it. Whichever is better for you, go for it (PS.... don't forget the cost).