Case Study: HOSTING

HOSTING (formerly hosting.com)

Managed hosting provider

Based in Denver (http://www.hosting.com)

Over 2,000 customers

Operates data centers in six locations in North America

 

HOSTING provides managed services on top of their own hosts as well as those hosted on Azure and AWS. Customers get cloud monitoring, security, and support from an experienced team of specialists. HOSTING also specializes in meeting HIPAA, PCI, and SOC compliance requirements with their cloud-based solution.

Success and age were pushing the limits of HOSTING’s provisioning and management system. Customers were being affected by performance and reliability issues. Support and maintenance costs were climbing, so HOSTING decided to rearchitect their proprietary system.

To accomplish their goals, the HOSTING team realized that they would need a better underlying messaging system. After evaluating the alternatives, they chose NServiceBus from Particular Software. To help bootstrap the redesign, HOSTING hired Architecting Innovation (AI), experts in distributed systems integration and architecture. Crucial to the overall redesign was an overhaul of HOSTING’s system for messaging between services, switching from poll-based to event-based messaging.

The project significantly reduced the load on HOSTING’s provisioning targets and saved hundreds of maintenance person-hours per year. The architectural improvements enabled the provisioning of servers within minutes, rather than hours.

The redesign increased HOSTING’s competitive position by:

  • Increasing customer satisfaction
  • Reducing cost
  • Increasing development productivity
  • Improving manageability

The work was completed in approximately 4.5 months, with HOSTING realizing a number of benefits that exceeded their expectations and that yielded valuable competitive advantages in the managed hosting solution space.

The Challenge

HOSTING was outgrowing their provisioning and management platform. The platform depended on a proprietary system for communication between software components. That system had served the company well during its earlier years, but as HOSTING scaled up and added new capabilities, it began to have problems:

  • The old management system consumed copious amounts of expensive IT resources.
  • Customer complaints came in for problems that should have been detected earlier.
  • Troubleshooting and resolving service incidents ate up too much staff time.
  • The search service did not always function as expected, negatively impacting users' ability to view and manage their devices and services.

HOSTING’s development team was aware of many of the technical issues that were causing those problems:

  • Intersystem communication issues caused errors to go unreported or unseen in various parts of the system. This meant that the Event Management Team was unable to proactively address issues before customers were affected.
  • Sparse logs limited the information needed during troubleshooting.
  • System component failures caused pile-ups in anything that used polling and callbacks, while the more robust components relied on a difficult-to-maintain proprietary MSMQ wrapper that left room for a wide range of "edge case" failures.
  • The provisioning team manually performed CMDB updates while reconfiguring the provisioning target and various legacy systems. This caused information in various places to be inaccurate, incomplete, and in some cases non-existent.

Project Goals

HOSTING decided to re-architect their management system. The goals of the project were to:

  • Reduce network and resource utilization.
  • Increase reliability and availability of critical components and communication channels.
  • Eliminate failures that had been resulting in users having to enter information or repeat operations in various places.
  • Proactively move toward new components, technologies, and architectures so that initial development, maintenance, and future integration efforts require 45% less engineering time.
  • Increase customer visibility into devices and services by bringing the search service component back to full functionality.
  • Enable and empower in-house engineering staff to design and develop new components that contribute to these goals.

NServiceBus vs. Mule

To accomplish their goals, the HOSTING team realized that they would need to use a better underlying messaging system. They needed to move more components off of IPC/RPC and polling protocols and onto an Enterprise Service Bus. It could be MSMQ-based, as was their current proprietary messaging system, but it had to be simpler to set up and maintain.

They considered two possible ESB products:

  1. Mule from MuleSoft
  2. NServiceBus from Particular Software

Mule is a Java-based environment that is reputedly handy in heterogeneous environments where integration is required with uncustomizable commercial software. The HOSTING implementation team had expected Mule to be lightweight and easy to use, but it turned out to be very complex in comparison to NServiceBus. It was also costlier.

NServiceBus
  • Simpler
  • Less costly
  • Scalable
  • Reliable

NServiceBus is a .NET-based messaging platform that facilitates development of distributed systems that are scalable, reliable and easy to maintain. The tool is the flagship offering from Particular Software, a company featured in Gartner’s "Cool Vendors in Web-Scale Platforms, 2015." NServiceBus met the requirements of the project while also being simpler to implement and less costly, so the decision was made to go with NServiceBus.

To help bootstrap the implementation of NServiceBus, HOSTING decided to go with Particular’s recommended integration partner, Architecting Innovation (AI). Distributed systems architecture and integration are among AI’s specialties, and AI had experts available with extensive NServiceBus experience.

The Solution

AI was hired to help with three aspects of the initial implementation of NServiceBus:

Integration

AI integrated NServiceBus as the primary messaging platform for HOSTING’s management system.

Roadmap

AI and HOSTING engineers and architects worked together to create a 5-year plan for future development work.

Training

AI trained HOSTING engineers on NServiceBus best practices and the initial integration.

The 4.5-month project positioned the company to get rid of a legacy component that was costing hundreds of person-hours per year to maintain.

The solution:

  1. Increased customer satisfaction
  2. Reduced costs
  3. Increased productivity
  4. Improved manageability

Integration

A key ingredient of the overall solution was an overhaul of HOSTING’s system for messaging between services, incorporating event-based messaging rather than poll-based. This resulted in a significant reduction in the load on HOSTING’s provisioning targets, enabled real-time data (by bypassing disparate caches that weren’t synced up), and positioned the company to get rid of a legacy component that was actively costing hundreds of person-hours per year to maintain. The architectural improvements enabled the provisioning of servers within minutes, rather than hours.
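The difference between the two messaging styles can be illustrated with a minimal Python sketch. It is not HOSTING’s code and does not use the NServiceBus API (which is .NET-based); `InMemoryBus`, `ProvisioningTarget`, and the `ServerProvisioned` event name are hypothetical stand-ins chosen for illustration. With polling, the consumer queries the provisioning target on every tick. With events, the target publishes once when it is ready and is never queried.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List, Optional, Tuple


class InMemoryBus:
    """Tiny in-process stand-in for a message bus (an assumption, not NServiceBus)."""

    def __init__(self) -> None:
        self._handlers: DefaultDict[str, List[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, event_type: str, handler: Callable[[dict], None]) -> None:
        self._handlers[event_type].append(handler)

    def publish(self, event_type: str, payload: dict) -> None:
        # Deliver the event to every handler registered for this event type.
        for handler in self._handlers[event_type]:
            handler(payload)


class ProvisioningTarget:
    """Hypothetical target that becomes ready after a number of ticks (must be >= 1)."""

    def __init__(self, ready_after_ticks: int, bus: Optional[InMemoryBus] = None) -> None:
        self.ready_after_ticks = ready_after_ticks
        self.bus = bus
        self.ticks = 0
        self.status_queries = 0  # load imposed on the target by status checks

    def tick(self) -> None:
        self.ticks += 1
        # Event-based mode: announce readiness exactly once, at the moment it happens.
        if self.bus is not None and self.ticks == self.ready_after_ticks:
            self.bus.publish("ServerProvisioned", {"server_id": "srv-1"})

    def get_status(self) -> str:
        self.status_queries += 1
        return "ready" if self.ticks >= self.ready_after_ticks else "pending"


def run_polling(ready_after_ticks: int) -> int:
    """Poll once per tick; return how many status queries hit the target."""
    target = ProvisioningTarget(ready_after_ticks)
    while True:
        target.tick()
        if target.get_status() == "ready":
            return target.status_queries


def run_event_based(ready_after_ticks: int) -> Tuple[int, List[dict]]:
    """Subscribe once; return the status query count and the events received."""
    bus = InMemoryBus()
    received: List[dict] = []
    bus.subscribe("ServerProvisioned", received.append)
    target = ProvisioningTarget(ready_after_ticks, bus)
    while not received:
        target.tick()
    return target.status_queries, received


if __name__ == "__main__":
    print("polling queries:", run_polling(5))
    print("event-based queries and events:", run_event_based(5))
```

In this toy run the polling consumer issues five status queries before it sees the server is ready, while the event-based consumer issues none and receives a single notification. A real bus adds durable queues, retries, and cross-process delivery, none of which is modeled here.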

One of the most important services to be added to the service bus was the CMDB solution used by HOSTING. The CMDB was used to keep track of subscriber information as well as device configurations. Putting the CMDB on the bus allowed support and provisioning personnel, as well as HOSTING’s solutions architects, to see device status in real time from one view, whereas previously they had to log into Nagios, Tivoli TEM, Cisco’s management console, vSphere, or the Azure/AWS backend. What’s more, changes to the CMDB now propagate out in real time and automatically update the data warehouse that feeds HOSTING’s customer-facing portal.
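The way a single CMDB change can reach several consumers at once might look like the following Python sketch. All names here (`Cmdb`, `DeviceChanged`, `StatusView`, `WarehouseFeed`) are hypothetical and the delivery is in-process; the actual system used NServiceBus over MSMQ, which this sketch does not attempt to reproduce.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class DeviceChanged:
    """Hypothetical event raised whenever a CMDB device record changes."""
    device_id: str
    status: str


class Cmdb:
    """Toy CMDB that announces every change to its subscribers."""

    def __init__(self) -> None:
        self._devices: Dict[str, str] = {}
        self._subscribers: List[Callable[[DeviceChanged], None]] = []

    def subscribe(self, handler: Callable[[DeviceChanged], None]) -> None:
        self._subscribers.append(handler)

    def update_device(self, device_id: str, status: str) -> None:
        self._devices[device_id] = status
        event = DeviceChanged(device_id, status)
        # Fan the change out so no consumer has to be updated by hand.
        for handler in self._subscribers:
            handler(event)


class StatusView:
    """Single view of current device status for support staff."""

    def __init__(self) -> None:
        self.current: Dict[str, str] = {}

    def on_device_changed(self, event: DeviceChanged) -> None:
        self.current[event.device_id] = event.status


class WarehouseFeed:
    """Append-only history, standing in for the data warehouse behind the portal."""

    def __init__(self) -> None:
        self.rows: List[Tuple[str, str]] = []

    def on_device_changed(self, event: DeviceChanged) -> None:
        self.rows.append((event.device_id, event.status))


cmdb = Cmdb()
view = StatusView()
warehouse = WarehouseFeed()
cmdb.subscribe(view.on_device_changed)
cmdb.subscribe(warehouse.on_device_changed)

cmdb.update_device("web-01", "provisioning")
cmdb.update_device("web-01", "online")
```

After the two updates, the status view holds only the latest state for `web-01`, while the warehouse feed holds both changes in order. Each consumer keeps its own shape of the data without the CMDB knowing anything about it beyond the subscription.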

To provide high availability and deliver improved disaster recovery, AI implemented MSMQ failover clustering and a clustered distributor.

Although the redesign of the management system reduced system complexity while increasing reliability and performance, the team still wanted better logging and monitoring capabilities for debugging future issues. For that, the team chose to implement full-scale logging with Graylog and monitoring with ScienceLogic.

Finally, to help HOSTING do an optimal job of pricing IaaS, AI recommended a product called CPQ (Configure Price Quote) from FinancialForce. The tool will help HOSTING understand costs and keep their pricing competitive.

Roadmap

Together, AI and HOSTING’s engineers and solutions architects compiled a 5-year plan to move the rest of HOSTING’s system and vendor services onto the bus. Those services include:

  • Zuora — subscription management
  • CloudSense — price quotes and order management
  • SalesForce — customer relationship management
  • OpenStack — cloud platform for implementing IaaS
  • Device/Service Search using Elasticsearch

Training

At the end of the project, the AI team created and delivered a tailored, multipart training course for HOSTING’s engineering and solutions architecture departments. The training was focused on increasing enterprise integration development quality, basic service bus utilization, and various distributed systems concepts and design patterns.

Results

Here are a few of the joint project team’s achievements:

  • Increased customer satisfaction
    • Reduced server provisioning time from hours to minutes
    • Improved the customer experience for portal users
    • Boosted uptime/reliability and performance
  • Reduced costs
    • Cut the number of SLA violations, saving on fines and penalties
    • Reduced the overhead imposed by the management system on provisioning targets
    • Increased engineering staff productivity by simplifying the overall architecture
    • Decreased troubleshooting time by hundreds of person-hours per year
    • Decreased staff time spent making manual updates to systems/databases
  • Increased development productivity for new capabilities
    • Made the 5-year roadmap not just a dream, but a practical possibility
    • Enabled HOSTING’s “Unified Cloud” vision for managing across multiple PaaS offerings
  • Improved manageability
    • Provided better-quality data and real-time analytics
    • Improved information on true costs for maximally competitive pricing

The entire project was completed in approximately 4.5 months. Most of the original goals were accomplished in that timeframe. Those that weren’t, such as bringing service/device search back to full functionality, were included in the 5-year roadmap. In the end, HOSTING realized a number of benefits that exceeded their expectations and that yielded a number of competitive advantages in the managed hosting solution space.