How not to upsell a customer.

Upselling existing customers is a required core competency in any Sales organization. Below is an email I received from my Account Representative at a large software-as-a-service company, which I think highlights how not to upsell.

“Hi Ashish,

As your account executive, I’m writing to tell you that August is my last month at ***** as I’m taking a 1 year leave of absence to travel the world. I’m telling you this because my departure could benefit your company. If you’ve ever thought about increasing your use of ***** by adding **** or upgrading to a higher ****, now is a great time to explore this option with me. With August being my last month, I’m highly motivated to seek the best possible deal from my management on your behalf. Since my management team evaluates on a case-by-case basis when offering discounts and other incentives, the best next step would be for us to have a quick discussion about your account to see what I might be able to offer. Please let me know if this is worth discussing.

Regardless of whether you’re interested in my offer, I wish you all the best with **** and your business as a whole and I look forward to my return around this time in 2011.”

The following were some of my thoughts when I read the above.

  • There is no mention of the new Account Representative that the account will be transitioning to.
  • The departure is spun as a positive that will benefit my company. That was an interesting twist.
  • Now that the account rep is leaving, they are highly motivated to get me the best possible deal. And previously they were not?
  • Yes, I get it. The primary focus is to have a great last month of bookings before the one-year world vacation.
  • There is no personalization as to why I should upgrade. What value will be provided? The only argument is cost-based: “you can get a discount, so spend more” is the rationale.

All one can do is shake one’s head and move on. Which I have: I am no longer a customer of the above-mentioned service.

Cool to be on the This Week In Cloud Computing show

This Week in Cloud Computing: Interview & Demo with Ringio President & CTO Ashish Soni

I just did a live screencast interview and demo of the Ringio product on the ThisWeekIn web TV network, specifically on the cloud computing channel, with Amanda Coolong.

One of the takeaways I had was that the idea of Ringio being an intelligent Google Voice for business resonates. People get it.

The broader pattern is that when describing anything new, it is easier to describe it relative to something else that is already reasonably well understood.

For techies, this is the same paradigm that is used in relative estimation in Agile development.

But I digress… Thanks to Amanda for being a great host. You can see the screencast here.

Cloudonomics: High Availability should be a part of your Cloud Computing ROI calculation

High Availability In the Cloud


(Editor’s Note: this has been cross-posted from the Ringio Blog.)

The typical factors considered when evaluating the ROI of the cloud compared with traditional data centers are:

  • Machine utilization, elastic demand, and auto-scaling. Most services do not need all of their servers all the time. The cloud allows you to scale up and down, reducing cost at low-demand times. This is especially well described on Joe Weinman’s Cloudonomics blog, where he states that “even if cloud services cost, say, twice as much, a pure cloud solution makes sense for those demand curves where the peak-to-average ratio is two-to-one or higher.” In a traditional setting you would have to be provisioned for peak demand all the time.
  • Power. In physical data centers, the cumulative power cost overtakes the machine cost somewhere in year 2 or 3. In the cloud, the unit machine cost includes the power cost.
  • Human resources to run the data center. With no physical work such as cabling, the personnel count required to support data center operations is lower in the cloud.
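Weinman’s two-to-one rule of thumb quoted above can be sanity-checked with a toy model. This is a minimal sketch under stated assumptions: owned hardware is provisioned for peak demand around the clock, cloud capacity tracks demand hour by hour, and the unit costs and demand curve are made-up illustrative numbers, not real prices.

```python
# Back-of-the-envelope model of the peak-to-average break-even.
# All numbers here are illustrative assumptions, not real prices.

def owned_cost(peak_demand, unit_cost):
    """Traditional data center: pay for peak capacity all the time."""
    return peak_demand * unit_cost

def cloud_cost(hourly_demand, unit_cost):
    """Cloud: pay only for what each hour actually uses (the average)."""
    return sum(hourly_demand) / len(hourly_demand) * unit_cost

# 24 hourly samples: quiet nights, a daily peak of 200 servers,
# averaging 100 -- a peak-to-average ratio of exactly 2:1.
demand = [50] * 16 + [200] * 8

owned = owned_cost(max(demand), unit_cost=1.0)   # provisioned for peak
cloud = cloud_cost(demand, unit_cost=2.0)        # at 2x the unit price

print(owned, cloud)   # 200.0 200.0 -- break-even at a 2:1 ratio
```

At exactly 2:1 the costs break even even though the cloud unit price is double; any spikier demand curve tips the balance toward the cloud, which is Weinman’s point.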

However, in addition to the factors above, a more thorough analysis also needs to look at the out-of-the-box high availability tool set that the cloud provides.

The cloud provides the following high availability benefits at a fraction of what they cost in a traditional data center setting.

  • Geographic Redundancy and Diversity
    • The Cloud Way: The cloud out of the box gives you the power to instantiate your services in geographically diverse regions.
    • The Traditional Data Center: Doing the same with your own data centers is costly, time consuming, and resource intensive. In addition, you need to pay for inter-site connections and point-to-point links, which can cost hundreds of thousands of dollars extra. You also need to provision 2x peak demand so that any one data center can take over all of your traffic.
  • Point In Time Data Snapshots
    • The Cloud Way: The cloud gives you block storage that allows for snapshotting which allows for data protection and point in time recovery.
    • The Traditional Data Center: The same capabilities can cost tens of thousands of dollars or more when implemented in local data centers by way of SANs and NAS devices.
  • Shared Data across Multiple Geographic Regions
    • The Cloud Way: With highly available network-attached-storage-like services, your data is available in multiple regions. In addition, block data can also be snapshotted and made available across multiple regions.
    • The Traditional Data Center: You need to work with ISPs on bandwidth and point-to-point (P2P) connections, and then worry about the availability of those P2P connections. You need to purchase SANs and NAS devices at each site for the data to be replicated.
  • Load Balancing & Basic Monitoring
    • The Cloud Way: The cloud provides this at a basic level and gets you off the ground quickly.
    • The Traditional Data Center: You need to purchase and configure your own load balancing and monitoring solution before you can get off the ground, even if your needs are basic.
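As one concrete illustration of the snapshot and cross-region bullets above, here is a hedged CLI sketch using modern AWS tooling. The volume ID, snapshot ID, and regions are placeholders, and the exact flags may vary by CLI version; treat this as a sketch, not a runbook.

```shell
# Take a point-in-time snapshot of a block volume (placeholder IDs).
aws ec2 create-snapshot \
    --volume-id vol-0abc1234567890 \
    --description "nightly point-in-time backup"

# Copy an existing snapshot into a second region for geographic
# redundancy; copy-snapshot is run against the destination region.
aws ec2 copy-snapshot \
    --source-region us-east-1 \
    --source-snapshot-id snap-0def1234567890 \
    --region us-west-2 \
    --description "cross-region copy for disaster recovery"
```

Two commands replace what, in a traditional data center, would mean SAN hardware at each site plus a P2P link between them.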

One of the key disruptions of cloud computing has been the commoditization of high availability offerings, some of which are mentioned above. This is a boon for startups like Ringio and for any existing enterprise looking at both the cloud and high availability.

In my view, when comparing the two approaches, it is not just the cost to run a service but the cost to run a service in a highly available manner where the cloud truly shines.

photo credit – calle vieja blog.

Launch day is here!

There is a reason why I have been missing in action over the last two months: I, along with my partners Sam Aparicio and Michael Zirngibl, have been in overdrive getting ready for the launch of our new venture, Ringio.

Well, launch day is finally here! As the press releases begin to hit the wires, we are seeing traffic and users increase. The excitement is palpable. How will the business do? How will the system do? How is the cloud behaving? What feedback are we going to get? These questions and many more are racing through our heads.

Much more to come in terms of an insider’s perspective on the launch… Stay Tuned!

High Availability Principle : Request Queueing

In my earlier post about concurrency control, I mentioned that due to the exponential growth of response time with increasing concurrency, it is better to reject the extra requests than to let them negatively affect your system. While rejecting these requests is an option, it is not the most desirable one. Request queueing offers a much more powerful solution, as explained below.

Request Queueing Example

For simplicity’s sake, let’s assume the following base response time characteristics for a system with no concurrency controls or request queueing:

  • at a concurrency of 500 or below the average response times are 1 second.
  • at a concurrency of 600 the average response times are 2 seconds for all requests.
  • at a concurrency of 700 the response times are 4 seconds for all requests.

The exponential degradation starts after a concurrency of 500. With these numbers in mind, let’s assume you have set up your concurrency controls optimally, so that 500 concurrent requests get through and the next 1000 requests (requests 501-1500) are set up to wait. That is, the max queue size is set to 1000.

Now let’s say you get a sudden traffic spike of 1501 requests. Here is what will happen:

  1. The first 500 requests get through and are served in 1 second on average. The next 1000 requests (requests 501-1500) are queued. The 1501st request is rejected.
  2. As requests 1-500 finish, requests 501-1000 are sent forward. So requests 501-1000 are served in 2 seconds total (1 second execution time and 1 second queue time).
  3. As requests 501-1000 finish, requests 1001-1500 are forwarded. So requests 1001-1500 are served in 3 seconds total (1 second execution time and 2 seconds queue time).
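The three-step walkthrough above can be sketched in a few lines. This is a toy model mirroring the example’s numbers (concurrency 500, 1-second service time, queue of 1000), not a real load balancer:

```python
# Toy model of request queueing under a spike of 1501 requests.
# Numbers mirror the worked example in this post.

CONCURRENCY = 500    # requests served simultaneously
SERVICE_TIME = 1     # seconds per request at optimal concurrency
MAX_QUEUE = 1000     # requests allowed to wait

def response_times(spike):
    """Return (per-request response times in seconds, rejected count)."""
    admitted = min(spike, CONCURRENCY + MAX_QUEUE)
    rejected = spike - admitted
    times = []
    for i in range(admitted):
        wave = i // CONCURRENCY                 # waves of 500: 0, 1, 2, ...
        times.append((wave + 1) * SERVICE_TIME)  # queue wait + service time
    return times, rejected

times, rejected = response_times(1501)
print(times[0], times[500], times[1000], rejected)   # 1 2 3 1
```

Response times grow linearly, one service time per wave in the queue, while the server itself never sees more than 500 concurrent requests.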

Characteristics of Systems With Request Queueing

  1. Request queueing allows your system to operate at optimal throughput. In the above example, optimal throughput was at a concurrency of 500. The concurrency at which optimal throughput is achieved is usually right below where exponential degradation starts. At all times the system was operating at this optimal throughput.
  2. Your users experience only linear degradation instead of exponential degradation. As shown in the diagram, with no request queueing your users and your system would have experienced exponential degradation after 500 requests. With request queueing, requests 1-500 take 1 second, 501-1000 take 2 seconds, 1001-1500 take 3 seconds, and so on: the response times become linear.
  3. Your system experiences NO degradation. This is worth repeating: the system is always operating at optimal throughput. The only dynamic attribute is the queue size. The system remains in the green zone highlighted in the diagram.

If your system can show the above 3 characteristics even during times of unusually high load, you are in good shape.

HAProxy is one solution that can do both concurrency control and request queueing, and it should be considered when designing for high availability.
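Since HAProxy handles both mechanisms, a hedged configuration sketch may help; the backend name, server address, and numeric limits here are placeholder assumptions, not a recommended production setup:

```
# Sketch of an HAProxy backend combining concurrency control
# (maxconn) with request queueing (the built-in per-server queue).
backend app
    timeout queue 10s            # give up on requests queued too long
    server app1 10.0.0.1:8080 maxconn 500
    # At most 500 concurrent requests reach app1; the rest wait in
    # HAProxy's queue until a slot frees up or the timeout fires.
```

With `maxconn` set just below the escape concurrency, the server never enters the exponential zone, and queued requests see exactly the linear wait described above.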

High Availability Principle : Concurrency Control

One important high availability principle is concurrency control. The idea is to allow through only as much traffic as your system can handle successfully. For example, if your system is certified to handle a concurrency of 100, then the 101st request should either time out, be asked to try later, or wait until one of the previous 100 requests finishes. The 101st request should not be allowed to negatively impact the experience of the other 100 users; only the 101st request should be impacted.
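The 100-versus-101 rule above can be sketched with a counting semaphore. This is a toy illustration, not a production admission controller; the limit and messages are made up for the example.

```python
# Toy concurrency control: up to LIMIT requests hold a slot, and
# extras are rejected immediately instead of degrading everyone else.

import threading

LIMIT = 100
slots = threading.Semaphore(LIMIT)

def handle(request_id):
    """Serve the request only if a concurrency slot is free."""
    if not slots.acquire(blocking=False):
        return f"request {request_id}: rejected (try again later)"
    try:
        return f"request {request_id}: served"   # real work goes here
    finally:
        slots.release()

# Simulate 100 requests already in flight by taking every slot...
for _ in range(LIMIT):
    slots.acquire(blocking=False)

print(handle(101))   # request 101: rejected (try again later)
slots.release()      # one in-flight request finishes
print(handle(101))   # request 101: served
```

The 101st request is turned away instantly and cheaply; the 100 in-flight requests never feel it.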

Why Concurrency Control?

Most systems exhibit response time patterns that rise exponentially after some critical concurrency limit (the escape concurrency) has been reached, as shown in the diagram. With concurrency control you can protect the system from entering the critical zone where this exponential response time behavior is experienced.

Benefits of Concurrency Control:

  1. Avoiding cascading failures. Imagine a sudden surge in traffic that goes above your system limits. Let’s say your database becomes the bottleneck. As the database slows down, threads on the application servers begin to pile up. Then the layer above your application servers (your web servers/load balancers) starts to pile up threads as well. Very soon every layer of the architecture has been compromised; the system has reached its escape concurrency and you have lost control of it. By controlling the amount of traffic you take in, you avoid this situation.
  2. Flexibility. The ability to control the amount of traffic your system receives is powerful. For example, after a release you realize that the scalability profile of your system in production has changed for the worse: you were initially certified for a certain concurrency, but all metrics point to the system not handling this load. You can now reduce the concurrency allowed. While this is not an ideal situation, it is much preferred to a system meltdown.
  3. Guarantee of service quality. One can be more confident in the quality of service provided to the requests that are allowed through.

One Solution

HAProxy is a good place to start your investigation. In addition to concurrency control, it offers a myriad of features such as load balancing and content switching. Concurrency limits can be set at a global level and/or at a server level. Importantly, it also offers request queueing, which is extremely powerful when used in concert with concurrency control. More on this in a later post. I recommend looking at HAProxy when designing high availability systems.

Concurrency controls should be put in place in concert with a capacity planning program. If system concurrency limits are being approached, you should start scaling the system and only then increase the concurrency allowed.

The Organizational Impact of Poor Software Quality

Organization Impact of Poor Software Quality


Most companies look at software quality through one of the following two lenses:

  1. A technology executive may look at software quality to determine whether the software is in a releasable state.
  2. A COO may look at software quality from the angle of customer churn: what percentage of customer churn was caused by poor software quality?

Both of the above views are indeed critical. However, one angle that typically does not get attention is the organizational impact of software quality. As shown in the diagram, poor quality can have a reverberating impact across the whole organization.

The diagram highlights:

  1. Poor quality in production causes the whole company to be interrupt-driven. As and when customers experience poor quality, the whole organization experiences the pain. You lose control of your schedule; you are on the customer’s bug-discovery schedule. This is the main reason for schedule slips in ongoing projects: people lose focus on the task at hand and are thrust into production patches or crisis resolution.
  2. Poor quality affects each and every department. The diagram shows how Customer Support, Sales, Quality Engineering, Engineering, Operations, and Product Management are all affected. Each of these departments is interrupted from its current focus.
  3. Poor quality leads to an order of magnitude more work for the organization. If an issue had been found and resolved within the quality/engineering teams, it would have been at most a 2- or 3-step process encompassing 1-2 groups. However, once the issue is exposed to the customer, it becomes a 13-step process (at best, assuming every one of the 13 steps goes without a hitch) encompassing the whole organization.

My Recommendation (2 Critical Metrics):

  1. Number of patches required after release. Each time a patch is released, the whole organization jumps through hoops, as highlighted in the diagram. If this is not brought down to zero (or close to it), you will fall behind on your next release; you will be in a vicious cycle. Start measuring this and bring it down. This is one of my most important metrics when measuring the job that quality does.
  2. Opportunity cost per patch. What is the average person-week cost of a patch for the organization? This gives an idea of the scale of the problem. Bring it down with automation.

By measuring the above metrics you can start to plan better, taking into account your historical patch cycle times and patch frequencies. Second, this can help highlight to non-technology executives the importance of quality across the organization.

