Preventing Disaster with Redundancy Solutions
By David Weiss
With the ongoing migration of voice communications toward server-based
facilities and IP networks, today’s phone system is susceptible to
all of the maladies of the network world, including hacker attacks,
viruses, and Trojan horses. While many companies have embraced
redundancy in the data center, few have recognized the need to
incorporate redundant systems into their voice technologies. As the
corporate world hinges on a constant stream of data and voice
communications, organizations must provide for the highest degree of
fault tolerance to maintain these vital links to customers, partners
and employees.
Historically, the resiliency of the voice network meant that disaster
planning budgets were focused on other areas of concern. Now,
organizations are deploying server-based PBXs, VoIP solutions,
conference bridges and call center systems. Providing new
capabilities at a fraction of the cost, these services are
increasingly critical to day-to-day business operations and, as
such, demand the high levels of uptime expected of a phone system.
Because these systems are increasingly built from components
supplied by multiple hardware and software vendors, however, they
are also more likely to experience failures.
Failures can occur at any level. Server-based PBXs incorporate complex
software, often sourced from multiple vendors; disk arrays and other
hardware technologies can also fail. Network connectivity exposes
the system to malicious internal and external attacks. IP phones,
computer-telephony integration and Bluetooth have created additional
levels of complexity for the handset, which is typically the most
reliable part of the phone system. Phone lines are subject to
occasional outages due to cable breaks and component failures at the
local and long-distance carrier levels.
Redundancy solutions are a key part of planning for the inevitable.
Fail-safe options on every level, including failover servers,
diverse phone lines, media storage and hot sites, can all minimize
unscheduled downtime and prevent real disasters.
Minimizing Failures on the Line Side: A highly available
solution attempts to eliminate single points of failure in all
aspects of a system’s design. To minimize the potential for failure,
network planners need to evaluate both link redundancy and hardware
redundancy. Carriers can also provide diversity and avoidance
services to help minimize risks.
Offered at both the local and long-distance levels, diversity refers
to redundant services, while avoidance ensures that redundant
services do not share common facilities. Loop diversity provides
two redundant circuits from a local point of presence (POP) to your
facility. POP diversity, in which local links originate from multiple
wire centers, or POPs, is an ideal solution. Interoffice diversity
provides the same level of service between wire centers.
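
The avoidance idea can be illustrated with a small check that two
redundant circuits share no facilities. This is only a sketch: the
route lists and facility names below are hypothetical, and a real
audit would rely on route records supplied by the carrier.

```python
def shared_facilities(route_a, route_b):
    """Return the facilities that two circuit routes have in common."""
    return set(route_a) & set(route_b)


# Hypothetical routes: each entry is a facility the circuit passes through.
primary = ["POP-A", "conduit-12", "wire-center-1"]
backup_same_pop = ["POP-A", "conduit-47", "wire-center-1"]
backup_diverse = ["POP-B", "conduit-47", "wire-center-2"]

# Loop diversity only: the circuits still share a POP and a wire center.
print(sorted(shared_facilities(primary, backup_same_pop)))

# POP diversity: no common facility remains on the two routes.
print(sorted(shared_facilities(primary, backup_diverse)))
```

An empty result suggests the two circuits meet the avoidance goal for
the facilities listed; any shared entry is a common point of failure.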
Diversity services may or may not include the customer premises
equipment necessary to switch between redundant links. Protection
switching for redundant T-1 or DS-3 circuits is either provided by
the carrier or purchased and installed by the customer. Any service
degradation or interruption is detected automatically and triggers a
switchover to a spare circuit. Protection switching is either 1:1,
with a standby circuit for each primary, or 1:N, with one spare
circuit shared among several primaries.
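
The practical difference between 1:1 and 1:N protection shows up when
more than one primary fails. The sketch below models it with a
hypothetical ProtectionGroup class and made-up circuit names; it does
not represent any carrier's or vendor's switching equipment.

```python
class ProtectionGroup:
    """Primary circuits protected by a pool of spare circuits.

    1:1 protection has one spare per primary; 1:N protection has a
    single spare shared by N primaries.
    """

    def __init__(self, primaries, spares):
        self.free_spares = list(spares)
        # Maps each primary to the circuit currently carrying its traffic.
        self.active = {p: p for p in primaries}

    def fail(self, primary):
        """Switch a failed primary to a spare; return None if none is left."""
        if self.active[primary] != primary:
            return self.active[primary]  # already switched (or already down)
        if not self.free_spares:
            self.active[primary] = None  # outage: no spare available
            return None
        spare = self.free_spares.pop(0)
        self.active[primary] = spare
        return spare


one_to_one = ProtectionGroup(["T1-1", "T1-2"], ["S-1", "S-2"])
one_to_n = ProtectionGroup(["T1-1", "T1-2", "T1-3"], ["S-1"])

# Two simultaneous failures: 1:1 covers both, 1:N covers only the first.
print(one_to_one.fail("T1-1"), one_to_one.fail("T1-2"))
print(one_to_n.fail("T1-1"), one_to_n.fail("T1-2"))
```

The trade-off is cost against coverage: 1:N needs fewer spare
circuits but protects against only one failure at a time.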
Even with VoIP solutions, any gateway to the public switched network
involves local loops and carrier services, and it’s essential that
this critical link to customers is not overlooked.
Minimizing Failures on the Equipment Side: The migration of
telephony services to a server-based platform has prompted the wide
acceptance of un-PBXs and IP-PBXs. Advantages such as an open and
flexible architecture, standardized components, multiple sources and
lower costs mean servers are now common for phone systems, voice
mail, call recorders and other voice technologies. Although
processors, memory, storage, power supplies, telephony boards,
operating systems and application software are supplied by “best of
breed” vendors, it is the end user or consultant who must ensure
all the pieces work together seamlessly.
For each piece, then, a critical metric is its availability, or
readiness to perform its stated function at any given time.
Availability does not take into account downtime for scheduled
maintenance. By adding lost time for maintenance, you arrive at true
operational availability. To achieve the best availability possible,
system engineers must look at maximizing reliability and minimizing
both scheduled and unscheduled downtime. Redundancy is the key to
providing both maximum reliability and minimum repair time.
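
The distinction between availability and operational availability can
be made concrete with a short calculation. This is a sketch under
simple assumptions: both figures are taken as uptime divided by total
hours in a year, and the downtime hours used are illustrative values,
not measurements.

```python
HOURS_PER_YEAR = 8760


def availability(unscheduled_down_hours, period_hours=HOURS_PER_YEAR):
    """Fraction of the period in service, ignoring scheduled maintenance."""
    return (period_hours - unscheduled_down_hours) / period_hours


def operational_availability(unscheduled_down_hours, scheduled_down_hours,
                             period_hours=HOURS_PER_YEAR):
    """Fraction of the period in service, counting all downtime."""
    total_down = unscheduled_down_hours + scheduled_down_hours
    return (period_hours - total_down) / period_hours


# Illustrative year: 4 hours of failures, 20 hours of planned maintenance.
print(f"{availability(4):.4f}")
print(f"{operational_availability(4, 20):.4f}")
```

Operational availability can never exceed the first figure, which is
why reducing scheduled downtime matters as much as preventing failures.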
At the component level, redundancy is commonplace for the parts most
likely to fail, such as disk drives, power supplies and other
mechanical items. Redundant arrays of inexpensive disks (RAID) and
redundant power supplies are standard in most systems for telephony
services.
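
The benefit of duplicating a component can be estimated with the
standard parallel-availability formula, 1 - (1 - A)^n. The sketch
below assumes the redundant copies fail independently, which real
power supplies or disks sharing a chassis may not; the 0.99 figure is
an illustrative value only.

```python
def parallel_availability(component_availability, copies):
    """Availability of a set of redundant copies where any one suffices.

    Assumes failures are independent, so the set is down only when
    every copy is down at the same time.
    """
    all_down = (1.0 - component_availability) ** copies
    return 1.0 - all_down


# Illustrative: a single power supply that is available 99% of the time.
for copies in (1, 2, 3):
    print(copies, parallel_availability(0.99, copies))
```

Under these assumptions, each added copy sharply reduces the chance
that every copy is down at once.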
For more complex applications, system-level redundancy needs to be
evaluated. With hot standby systems, two complete, identical
systems are installed. The standby server, an SNMP manager or
another management facility continuously monitors the health of the
primary system and, upon detecting any failure, automatically
switches the standby system into service. Deploying two systems also
allows scheduled maintenance to be performed with minimal
interruption, as one server is upgraded while the other is in
service. And, if a planned upgrade or new software installation goes
awry, the organization has an immediate, graceful fallback.
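
The monitoring loop behind a hot standby pair can be sketched as
follows. The HotStandbyPair class, the health-check callable and the
max_missed threshold are all hypothetical; requiring several missed
checks before switching is an added assumption meant to avoid
failing over on a single lost poll.

```python
class HotStandbyPair:
    """Tracks which of two identical systems is in service."""

    def __init__(self, check_primary, max_missed=3):
        self.check_primary = check_primary  # callable, True when healthy
        self.max_missed = max_missed
        self.missed = 0
        self.active = "primary"

    def poll(self):
        """Run one health check and return the system now in service."""
        if self.active == "standby":
            # Stay on the standby until an operator restores the primary.
            return self.active
        if self.check_primary():
            self.missed = 0
        else:
            self.missed += 1
            if self.missed >= self.max_missed:
                self.active = "standby"
        return self.active


# Simulated health checks: the primary stops answering after the first poll.
results = iter([True, False, False, False, True])
pair = HotStandbyPair(lambda: next(results))
print([pair.poll() for _ in range(5)])
```

In this sketch the pair does not switch back automatically; returning
to the primary is left as a deliberate, manual step.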
Redundancy switches essentially perform the same function as a patch
panel, but do so automatically and simultaneously for all circuits.
They operate on the physical layer, moving the actual wires from
phone lines and operator instruments to telephony boards in the
system. With IP phone systems, switching is only required for the
phone line side. As the central component of a fault-tolerant
solution, the redundancy switch itself must not represent a single
point of failure; magnetic latching relays prevent this by providing
a continuous mechanical connection in all circumstances.
Although redundant servers are optimal solutions, several issues need
to be considered. Databases must be synchronized so that all
configurations, security settings and call logs remain identical.
Licensing is also a factor. If redundant systems share the same
licenses, either the dongles must be switched between them or an
add-on module is required to support automatic redundancy
switchover.
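
One simple way to confirm that two systems hold identical settings is
to compare a fingerprint of each configuration. The sketch below
assumes each system can export its configuration as a plain
dictionary, which is a hypothetical interface; actual PBX products
will have their own export and replication tools.

```python
import hashlib
import json


def fingerprint(config):
    """Return a SHA-256 digest of a configuration dictionary."""
    # Sorting keys makes the digest independent of key order.
    canonical = json.dumps(config, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


def in_sync(primary_config, standby_config):
    """True when both systems hold identical configuration data."""
    return fingerprint(primary_config) == fingerprint(standby_config)


# Hypothetical exported settings from each server.
primary = {"extensions": [101, 102], "voicemail": True}
standby_ok = {"voicemail": True, "extensions": [101, 102]}
standby_stale = {"extensions": [101], "voicemail": True}

print(in_sync(primary, standby_ok))
print(in_sync(primary, standby_stale))
```

A mismatch only signals that the systems have drifted apart; finding
and repairing the difference is a separate step.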
Conclusion: For many types of organizations, voice technology
remains critical for effective corporate communications and customer
service. To maintain voice systems and ensure their optimal
operation, organizations must consider the level of fault tolerance
required and how redundancy solutions can help meet these
objectives. Deploying diversity and avoidance services, as well as
protection switching, will help minimize failures on the line side.
In addition, standby systems and redundancy switches will mitigate
the risk of failures on the equipment side. Unplanned downtime is
clearly not acceptable in most businesses, and redundancy solutions
maximize uptime and allow voice communications to continue in the
event of system failures.
David Weiss has nearly 20 years of experience in product management,
business development, sales and marketing and is an expert in the
remote site management technology industry. David serves as the
president and CEO of Dataprobe, a remote site management and
monitoring solutions provider.
[Contact the author for permission to republish or reuse this article.]