United States Patent 6,105,089
Chari, et al.
August 15, 2000
Data management system for adding or exchanging components on a running
computer
Abstract
A data management system supports hot plug operations on a computer by
defining, organizing, and maintaining hot plug variables, stored on a
computer readable medium, which identify components of a computer that may
be involved in hot plug operations, and which also identify capabilities
and operational states of those components as well as control their
operation and interface to the computer. The hot plug variables identify a
component as well as represent states and capabilities of a component, and
thus the hot plug variables advantageously operate as commands to
predetermined components of a computer which support hot plug operations
on the computer. The system generally comprises a plurality of variables
to support adding a component to or exchanging components of a computer
while the computer runs, as well as hot plug variable data stored in a
computer readable medium, the hot plug variable data representing
capabilities, characteristics or states of components of the computer, the
hot plug variable data related to the plurality of variables. In one
embodiment, the system comprises management software running on a client
computer, the management software generating requests such as a suspend
device driver request, a power off request, a power on request, and a
resume device driver request, and sending the requests over a network. In
the embodiment, the system further comprises agent software running on a
server computer, the agent software responsively recognizing the
respective requests to suspend a device driver running on the server
computer, to stop power to a component of the server computer, to start
power to the server computer, and to resume the device driver.
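The request sequence described in this abstract can be illustrated with a minimal sketch. It is not taken from the patent: the names `Request`, `Agent`, and `hot_swap` are hypothetical, the network is replaced by a direct method call, and the ordering checks inside `handle` are an assumption about how an agent might guard the sequence.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Request(Enum):
    """The four requests named in the abstract."""
    SUSPEND_DRIVER = "suspend device driver"
    POWER_OFF = "power off"
    POWER_ON = "power on"
    RESUME_DRIVER = "resume device driver"


@dataclass
class Agent:
    """Hypothetical stand-in for the server-side agent (one slot, one driver)."""
    driver_active: bool = True
    slot_powered: bool = True
    log: List[str] = field(default_factory=list)

    def handle(self, request: Request) -> None:
        # Assumed safety checks: the abstract lists the requests but does not
        # state that the agent enforces their order.
        if request is Request.SUSPEND_DRIVER:
            self.driver_active = False
        elif request is Request.POWER_OFF:
            if self.driver_active:
                raise RuntimeError("suspend the driver before cutting power")
            self.slot_powered = False
        elif request is Request.POWER_ON:
            self.slot_powered = True
        elif request is Request.RESUME_DRIVER:
            if not self.slot_powered:
                raise RuntimeError("power the slot before resuming the driver")
            self.driver_active = True
        self.log.append(request.value)


def hot_swap(agent: Agent) -> List[str]:
    """Issue the four requests in the order the abstract lists them."""
    for request in (Request.SUSPEND_DRIVER, Request.POWER_OFF,
                    Request.POWER_ON, Request.RESUME_DRIVER):
        agent.handle(request)
    return agent.log
```

In this sketch the component would be physically exchanged between the power off and power on requests, while the driver is suspended and the slot is unpowered.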
Inventors: Chari; Srikumar N. (Cupertino, CA); Bright; Kenny L. (Hayward, CA); Sartirana; Bruno (Sunnyvale, CA)
Assignee: Micron Electronics, Inc. (Nampa, ID)
Appl. No.: 942129
Filed: October 1, 1997
Current U.S. Class: 710/302; 710/18; 713/310; 713/320; 713/324
Intern'l Class: G06F 013/00
Field of Search: 395/281-283,828-838,500,613,200.55,200.56,200.57,681; 364/514 A,514 B,514 C,514 R,240,240.1; 713/310,320,324,340
References Cited
U.S. Patent Documents
4999787 | Mar., 1991 | McNally et al. | 364/514.
5123017 | Jun., 1992 | Simpkins et al. | 371/15.
5210855 | May., 1993 | Bartol | 395/500.
5261044 | Nov., 1993 | Dev et al. | 345/357.
5272584 | Dec., 1993 | Austruy et al. | 631/58.
5337413 | Aug., 1994 | Lui et al. | 395/275.
5340340 | Aug., 1994 | Hasting et al. | 439/64.
5379409 | Jan., 1995 | Ishikawa | 395/575.
5386567 | Jan., 1995 | Lien et al. | 395/700.
5471617 | Nov., 1995 | Farrand et al. | 395/700.
5487148 | Jan., 1996 | Komorie et al. | 395/182.
5491796 | Feb., 1996 | Wanderer et al. | 395/200.
5513314 | Apr., 1996 | Kandasamy et al. | 395/182.
5517646 | May., 1996 | Piccirillo et al. | 395/700.
5533198 | Jul., 1996 | Thorson | 395/200.
5539883 | Jul., 1996 | Allon et al. | 395/200.
5546595 | Aug., 1996 | Norman et al. | 395/800.
5555510 | Sep., 1996 | Verseput et al. | 364/514.
5564024 | Oct., 1996 | Pemberton | 395/283.
5576946 | Nov., 1996 | Bender et al. | 364/146.
5579491 | Nov., 1996 | Jeffries et al. | 395/283.
5581712 | Dec., 1996 | Herrman | 395/283.
5592610 | Jan., 1997 | Chittor | 395/182.
5604873 | Feb., 1997 | Fite et al. | 395/283.
5621892 | Apr., 1997 | Cook | 395/200.
5632021 | May., 1997 | Jennings et al. | 395/309.
5638289 | Jun., 1997 | Yamada et al. | 364/489.
5644731 | Jul., 1997 | Liencres et al. | 395/283.
5651006 | Jul., 1997 | Fujino et al. | 370/408.
5652892 | Jul., 1997 | Ugajin | 713/300.
5655081 | Aug., 1997 | Bonnell et al. | 395/200.
5671441 | Sep., 1997 | Glassen et al. | 395/828.
5678042 | Oct., 1997 | Pisello et al. | 395/610.
5684945 | Nov., 1997 | Chen et al. | 395/182.
5689637 | Nov., 1997 | Johnson et al. | 395/182.
5745897 | Apr., 1998 | Perkins et al. | 707/101.
5751575 | May., 1998 | Hirosawa et al. | 364/188.
5754426 | May., 1998 | Dumais | 364/188.
5758103 | May., 1998 | Oh | 395/283.
5761085 | Jun., 1998 | Giorgio | 702/333.
5761429 | Jun., 1998 | Thompson | 395/200.
5764911 | Jun., 1998 | Tezuka et al. | 395/200.
5764913 | Jun., 1998 | Jancke et al. | 395/200.
5768541 | Jun., 1998 | Pan-Ratzlaff | 395/283.
5774667 | Jun., 1998 | Garvey et al. | 395/200.
5781798 | Jul., 1998 | Beatty et al. | 395/830.
5784576 | Jul., 1998 | Guthrie et al. | 395/283.
5787246 | Jul., 1998 | Litchtman et al. | 395/200.
5812750 | Sep., 1998 | Dev et al. | 395/182.
5826046 | Oct., 1998 | Nguyen et al. | 395/309.
5838319 | Nov., 1998 | Guzak et al. | 345/340.
5901304 | May., 1999 | Hwang et al. | 395/500.
5913037 | Jun., 1999 | Spofford et al. | 395/200.
5922051 | Jul., 1999 | Sidey | 709/223.
Other References
Standard Overview, http://www.pc-card.com/stand_overview.html#1, 9
pages, Jun. 1990, "Detailed Overview of the PC Card Standard."
Digital Equipment Corporation, datasheet, 140 pages, 1993, "DECchip 21050
PCI-to-PCI Bridge."
NetFrame Systems Incorporated, News Release, 3 pages, referring to May 9,
1994, "NetFrame's New High-Availability ClusterServer Systems Avoid
Scheduled as well as Unscheduled Downtime."
Compaq Computer Corporation, Phoenix Technologies, LTD, and Intel
Corporation, specification, 55 pages, May 5, 1995, "Plug & Play BIOS
Specification."
NetFrame Systems Incorporated, datasheet, Feb. 1996, "NF450FT Network
Mainframe."
NetFrame Systems Incorporated, datasheet, Mar. 1996, "NetFrame Cluster
Server 8000."
Joint work by Intel Corporation, Compaq, Adaptec, Hewlett Packard, and
Novell, presentation, 22 pages, Jun. 1996, "Intelligent I/O Architecture."
Lockareff, M., HTINews,
http://www.hometoys.com/htinews/dec96/articles/loneworks.htm, Dec. 1996,
"Loneworks--An Introduction."
Schofield, M.J., http://www.omegas.co.uk/CAN/canworks.htm, Copyright 1996,
1997, "Controller Area Network--How CAN Works."
NRTT, Ltd., http://www.nrtt.demon.co.uk/cantech.html, 5 pages, May 28,
1997, "CAN: Technical Overview."
PCI Special Interest Group, specification, 35 pages, Draft For Review Only,
Jun. 15, 1997, "PCI Bus Hot Plug Specification."
Microsoft Corporation, file:///A|/Rem_devs.htm, 4 pages,
Copyright 1997, updated Aug. 13, 1997, "Supporting Removable Devices Under
Windows and Windows NT."
IBM Technical Disclosure Bulletin, 35(2): 69-74, Jul. 1992, "Serial Channel
Synchronizer"
IBM Technical Disclosure Bulletin, 39(7):229-230, Jul. 1996.
Primary Examiner: Sheikh; Ayaz R.
Assistant Examiner: Phan; Raymond N
Attorney, Agent or Firm: Knobbe, Martens, Olson and Bear, LLP
Parent Case Text
RELATED APPLICATIONS
The subject matter of the U.S. patent application entitled "Data Management
Method Supporting Hot Plug Operations On A Computer," filed on Oct. 1,
1997, application Ser. No. 08/942,124, is related to this application.
PRIORITY CLAIM
The benefit under 35 U.S.C. § 119(e) of the U.S. provisional
application Ser. No. 60/046,310, entitled "High Performance Network Server
System Management Interface," and filed on May 13, 1997, is hereby claimed.
Claims
What is claimed is:
1. A system for managing an addition of a component to a computer,
comprising:
a computer having an expansion slot for connecting to a component;
a component which can be connected to said computer;
an extended management information base, including a plurality of
variables, said variables selected to support adding said component to
said computer while said computer runs, said management information base
stored on a computer readable medium;
a component add software module stored in executable form on said computer,
said component add software module including at least some of said
plurality of variables, said component add software module directing
instructions to said slot to permit adding said component to said computer
while said computer runs;
a power state variable for determining and regulating a power state of said
expansion slot of said computer, said expansion slot capable of being
powered down individually; and
hot plug software running on said computer, said hot plug software
responsively recognizing a power state command to stop power to said
expansion slot or to start power to said expansion slot, said power state
command related to said power state variable.
2. The system as described in claim 1, further comprising:
a component command variable for configuring or changing an operational
state of said component, said component command variable included in said
variables; and
a component command related to said component command variable, said hot
plug software responsively recognizing said component command to configure
or change the operating state of said component.
3. A system for managing an exchange of components of a computer,
comprising:
a computer having an expansion slot for connecting to a component;
a first component connected to said expansion slot;
an extended management information base, including a plurality of
variables, said variables selected to support exchanging said first
component with another component while said computer runs, said management
information base stored on a computer readable medium;
a component exchange software module stored in executable form on said
computer, said component exchange software module including at least some
of said plurality of variables, said component exchange software module
directing instructions to said slot to permit exchanging said first
component with another component while said computer runs;
a component state variable for suspending or resuming a device driver, said
component state variable included in said plurality of variables;
a device driver running on said computer and servicing said first
component; and
hot plug software running on said computer, said hot plug software
responsively recognizing said component state command to suspend or resume
the operation of said device driver, said component state command
formatted to include a representation of said component state variable.
4. The system as described in claim 3, further comprising:
a power state variable for determining the power state of said expansion
slot, said power state variable included in said plurality of variables;
and
a power state command related to said power state variable, said hot plug
software responsively recognizing said power state command to determine
the power state of said expansion slot, to shut off power to said
expansion slot, or to start power to said expansion slot.
5. The system as described in claim 4, further comprising:
a compiled management information base stored on a computer readable
medium, said power state variable and said component state variable
defined in said compiled management information base.
6. A system for managing hot plug data to hot add or hot swap components of
a computer, comprising:
a computer having an expansion slot;
a component capable of connecting to said expansion slot;
hot plug variable data stored on a computer readable medium, said hot plug
variable data corresponding to hot plug variables selected to support
adding said component to said computer while said computer runs, removing
said component from said expansion slot while said computer runs, and
replacing said component connected to said expansion slot with another
component while said computer runs;
a component add software module stored in executable form on said computer,
said component add software module including or accessing at least some of
said hot plug variable data, said component add software module directing
instructions to said slot to permit adding said component to said computer
while said computer runs;
a power state variable for examining or regulating the power state of said
expansion slot, said power state variable included in said hot plug
variables; and
hot plug software running on said computer, the hot plug software polling
said expansion slot to obtain power state variable data, the hot plug
software storing said obtained power state variable data with said hot
plug variable data on said computer readable medium.
7. The system as described in claim 6, further comprising:
agent software running on said computer, said agent software responsively
recognizing said power state command to determine the power state of said
expansion slot, to shut off power to said expansion slot, or to start
power to said expansion slot, the power state command including a
representation of said power state variable data.
8. The system as described in claim 7, further comprising:
a device driver running on said computer and servicing said component;
a component state variable included in said hot plug variables; and
a component state command for suspending or resuming operation of a device
driver, said agent software responsively recognizing said component state
command to suspend or resume operation of said device driver, said
component state command related to said component state variable.
9. The system as described in claim 8, further comprising:
a compiled management information base stored on computer storage media
accessible to said computer, said compiled management information base
defining said hot plug variables.
10. A system for controlling hot plug operations on a server computer,
comprising:
a server computer having server-based hot plug management information base
variables stored on a computer readable medium;
agent software running on said server computer, said agent software
responsive to hot plug-related requests to perform hot plug operations on
said server computer, said hot plug commands formatted to include a
representation of at least one of said hot plug management information
base variables, said requests including a request to shut off power to a
component of said server computer and a request to start power to a
component of said server computer; and
a compiled management information base stored on a computer readable
medium, said compiled management information base defining for said agent
software a structure for each of said hot plug management information base
variables, said agent software collecting said hot plug management
information base variables by polling components of said computer for
characteristic, capability and state information corresponding to said
respective structures for each management information base variable.
11. A system for controlling hot plug operations on a server computer,
comprising:
a server computer having server-based hot plug management information base
variables stored on a computer readable medium;
agent software running on said server computer, said agent software
responsive to hot plug-related requests to perform hot plug operations on
said server computer, said hot plug commands formatted to include a
representation of at least one of said hot plug management information
base variables, said requests including a request to shut off power to a
component of said server computer and a request to start power to a
component of said server computer;
a compiled management information base stored on a computer readable
medium, said compiled management information base defining for said agent
software a structure for each of said hot plug management information base
variables, said agent software collecting said hot plug management
information base variables by polling components of said computer for
characteristic, capability and state information corresponding to said
respective structures for each management information base variable;
a client computer having client-based hot plug management information base
variables stored on a computer readable medium, said client-based hot plug
management information base variables corresponding to characteristics,
capabilities or states of components of said server computer;
management software running on said client computer, said management
software generating hot plug-related requests, said requests including a
request to shut off power to a component and a request to start power to a
component; and
a network permitting said client computer and said server computer to
communicate, said management software sending said requests to said agent
software over said network.
12. A system for refreshing hot plug variables, comprising:
a server computer;
a client computer;
a network permitting said client computer and said server computer to
communicate;
hot plug variable data stored on a computer readable medium local to said
server computer, said hot plug variable data including existing component
variable data identifying at least one component of said server computer;
agent software running on said server computer, said agent software polling
at least one component of said server computer for new component variable
data identifying said at least one component of said server computer, said
agent software storing said new component variable data on said computer
readable medium to supersede said existing component variable data; and
a retrieve data routine running on said client computer at a predetermined
time interval, said retrieve data routine generating requests for said hot
plug variable data and sending said requests to said agent software over
said network, said agent software responding by sending over said network
said hot plug variable data which supersedes hot plug variable data stored
on a computer readable medium of said client computer.
13. The system as described in claim 12, further comprising:
a power state variable corresponding to the power state of at least one
component of said first computer, said agent software polling at least one
component of said server computer for new power state variable data and
storing obtained new power state variable data to supersede existing power
state variable data, said requests generated by said retrieve data routine
including requests for power state variable data, said agent software
polling at least one component of said server computer for new power state
variable data and sending said new power state variable data over said
network to supersede power state variable data stored on a computer
readable medium of said client computer.
14. A system for broadcasting hot plug variables, comprising:
a server computer;
a client computer;
a network permitting said client computer and said server computer to
communicate;
hot plug variable data stored on a computer readable medium local to said
server computer, said hot plug variable data including existing component
variable data identifying at least one component of said server computer;
agent software running on said server computer, said agent software polling
at least one component of said server computer for new component variable
data identifying said at least one component of said server computer, said
agent software storing said new component variable data on said computer
readable medium to supersede said existing component variable data; and
a broadcast routine running on said server computer, said broadcast routine
sending said hot plug variable data over said network to said client
computer.
15. The system described in claim 14, further comprising:
a software timer for producing a timer event at a predetermined time
interval, said broadcast routine responsive to said timer event to execute
at said predetermined time intervals.
16. The system described in claim 15, further comprising:
a compiled management information base defining said hot plug variables,
said compiled management information base stored on a computer readable
medium of said server computer.
17. A system for managing hot plug operations, comprising:
a binary hot plug management information base stored on computer readable
media accessible to a server computer, said binary hot plug management
information base generated by compiling with a management information base
compiler a hot plug management information base module, said hot plug
management information base module defining hot plug variables selected to
support adding a component to said server computer while said server
computer runs, removing a component from said server computer while said
server computer runs, and exchanging components of said server computer
while said server computer runs, said hot plug variables including a power
state variable for regulating the power state of at least one component of
said server computer and also including a component state variable for
suspending or resuming the operation of at least one component of said
server computer;
hot plug software running on said server computer and responsive to
commands related to variables defined in said binary hot plug management
information base;
a client computer linked to said server computer by a network, said binary
hot plug management information base stored on computer readable medium of
said client computer; and
hot plug management software running on said client computer, said hot plug
management software generating requests to perform hot plug operations,
said requests including a request to shut down power to a component of
said server computer and a request to start power to a component of said
server computer, said requests also including a request to suspend the
operation of a component of said server computer and a request to resume
the operation of a component of said server computer, each of said
requests related to a hot plug variable defined in said binary hot plug
management information base, said hot plug management software sending
said requests over said network to said hot plug software.
18. A computer system for controlling, from a client computer, hot plug
operations on a server computer, comprising:
a client computer;
hot plug management software running on said client computer;
a server computer having an expansion slot and a component removably
connected to said expansion slot;
a device driver running on said server computer and servicing said
component;
hot plug agent software running on said server computer;
a compiled management information base defining hot plug variables, said
compiled management information base stored on a computer readable medium
of said server computer, and a copy of said compiled management
information base stored on a computer readable medium of said client
computer;
server-based hot plug management information base variables stored on a
computer readable medium of said server computer;
client-based hot plug management information base variables stored on a
computer readable medium of said server computer;
a suspend component state command generated by said hot plug management
software to suspend the operation of said device driver, said suspend
component state command related to a component state hot plug variable
defined in said compiled management information base, said hot plug
management software sending said suspend component state command to said
hot plug agent, said hot plug agent responsively recognizing said suspend
component state command to suspend operation of said device driver;
a power off command generated by said hot plug management software to stop
power to said expansion slot, said power off command related to a power
state hot plug variable defined in said compiled management information
base, said hot plug management software sending said power off command to
said hot plug agent, said hot plug agent responsively recognizing said
power off command to stop power to said expansion slot;
a power on command generated by said hot plug management software to start
power to said expansion slot, said power off command related to a power
state hot plug variable defined in said compiled management information
base, said hot plug management software sending said power on command to
said hot plug agent, said hot plug agent responsively recognizing said
power on command to start power to said expansion slot; and
a resume component state command generated by said hot plug management
software to resume the operation of said device driver, said resume
component state command related to a component state hot plug variable
defined in said compiled management information base, said hot plug
management software sending said resume component state command to said
hot plug agent, said hot plug agent responsively recognizing said resume
component state command to resume operation of said device driver.
19. A system for controlling hot plug operations, comprising:
a server computer having an expansion slot and a component removably
connected to said expansion slot;
a device driver running on said server computer and servicing said
component;
a database of hot plug commands, said database defining the structure of
power state commands for determining or regulating the power state of said
expansion slot, said database defining the structure of component state
commands for suspending or resuming the operation of said device driver,
and said database defining the structure of component commands for
configuring or changing the operating condition of said component; and
hot plug software running on said server computer, said hot plug software
responsively recognizing said power state commands to determine or
regulate the power state of said expansion slot, said hot plug software
responsively recognizing said component state commands to suspend or
resume the operation of said device driver, and said hot plug software
responsively recognizing said component commands to configure or change
the operating condition of said component.
20. A system for controlling hot plug operations, comprising:
a server computer having an expansion slot and a component removably
connected to said expansion slot;
a device driver running on said server computer and servicing said
component;
a database of hot plug commands, said database defining the structure of
power state commands for determining or regulating the power state of said
expansion slot, said database defining the structure of component state
commands for suspending or resuming the operation of said device driver,
and said database defining the structure of component commands for
configuring or changing the operating condition of said component;
hot plug software running on said server computer, said hot plug software
responsively recognizing said power state commands to determine or
regulate the power state of said expansion slot, said hot plug software
responsively recognizing said component state commands to suspend or
resume the operation of said device driver, and said hot plug software
responsively recognizing said component commands to configure or change
the operating condition of said component;
a client computer;
a network permitting said client computer and said server computer to
communicate;
a copy of said database of hot plug commands stored on a computer readable
medium of said client computer; and
hot plug management software running on said client computer, said hot plug
management software generating said power state commands, said component
state commands, and said component commands, each of said commands related
to respective structure definitions in said database of hot plug commands,
said hot plug management software sending said commands over the network
to said hot plug software.
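The refresh arrangement recited in claims 12 and 13 can be illustrated with a minimal sketch. It is an illustration only and does not limit the claims: the class names, the `poll_components` callable, and the variable name `slot3PowerState` are hypothetical; the network is replaced by a direct method call; and the predetermined time interval is left to the caller, who would invoke `retrieve_data` from a timer.

```python
from typing import Callable, Dict


class Agent:
    """Server side: polls components and keeps the server-local copy."""

    def __init__(self, poll_components: Callable[[], Dict[str, str]]) -> None:
        self._poll = poll_components
        self.variables: Dict[str, str] = {}

    def refresh(self) -> None:
        # Newly polled data supersedes the existing variable data.
        self.variables = dict(self._poll())

    def handle_request(self) -> Dict[str, str]:
        # Poll again on each request, then return a copy for the client.
        self.refresh()
        return dict(self.variables)


class Client:
    """Client side: holds its own copy of the hot plug variable data."""

    def __init__(self, agent: Agent) -> None:
        self._agent = agent
        self.variables: Dict[str, str] = {}

    def retrieve_data(self) -> None:
        # The response supersedes whatever the client stored previously.
        self.variables = self._agent.handle_request()
```

Between calls to `retrieve_data` the client copy can be stale, which is why the claim recites running the routine at a predetermined time interval.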
Description
APPENDICES
Appendix A, which forms a part of this disclosure, is a list of commonly
owned copending U.S. patent applications. Each one of the applications
listed in Appendix A is hereby incorporated herein in its entirety by
reference thereto.
Appendix B, which forms part of the disclosure of this patent application,
is a copy of the U.S. provisional patent application, 122 pages, filed May
13, 1997, entitled "High Performance Network Server System Management
Interface" and assigned Application No. 60/046,310. Page 1, line 7 of the
provisional application has been changed from the original to positively
recite that the entire provisional application, including the attached
documents, forms part of this disclosure.
COPYRIGHT RIGHTS
A portion of the disclosure of this patent document contains material which
is subject to copyright protection. The copyright owner has no objection
to the facsimile reproduction by anyone of the patent document or the
patent disclosure, as it appears in the Patent and Trademark Office patent
files or records, but otherwise reserves all copyright rights whatsoever.
FIELD OF THE INVENTION
The invention relates to the field of computers and computer software. In
particular, the invention relates to managing configurations of computers.
BACKGROUND OF THE INVENTION
In response to organizations' increasing reliance on network-based server
computers and the increasing cost of computer downtime, manufacturers
developed fault tolerant or redundant systems designed to reduce downtime.
Such systems typically use memory backup and redundant components in
attempting to provide continuous system operation. Many redundant systems
can be found in the prior art.
For example, U.S. Pat. No. 4,607,365 to Greig, et al., discloses a system
that automatically selects secondary components as needed to compensate
for faults in the system. Similarly, U.S. Pat. No. 4,727,516 to Yoshida,
et al. discloses redundant memory arrays and U.S. Pat. Nos. 4,484,275 and
4,378,588 to Katzman et al. teach multiple processors. While those
redundant computer systems may prevent a complete server failure in some
cases, those systems do not address many causes of computer downtime.
Studies show that a significant percentage of network server downtime is
caused by transient faults in the I/O subsystem. These faults may be due,
for example, to adapter card firmware or hardware that does not properly
handle concurrent errors, and they often cause servers to crash or hang.
Diagnosing intermittent errors can be a frustrating and time-consuming
process. The result is hours of downtime per failure, while a system
administrator discovers the failure, takes some action, and manually
reboots the server. The computer systems of the prior art do not provide a
computer system manager with the tools needed to keep computers running
while failed parts are removed and repaired or while upgrades are
performed.
Moreover, even if hardware components of a server computer can withstand
being added or removed without shutting down the server computer or making
it unavailable, a system manager could not simply remove a piece of
hardware and plug in another piece without causing immense disruption of
the software. Such a physical swap would cause hundreds or thousands of
error conditions every few seconds, likely resulting in corruption of data
and possibly even systemwide software failure. Low level software modules,
particularly device drivers, must be carefully administered during any
change to the hardware components they service. Making matters more
difficult, device drivers are among the most complicated and least
understood classes of software, few of them alike, but nearly all having
arbitrary and arcane command sets.
Without some tool to provide guidance and uniformity, network
administrators could only add components to or remove components from an
operating computer by issuing precise sequences of arcane, error-prone commands
having difficult-to-remember, numeric-range parameter values, interspersed
with a variety of hardware manipulations, with little or no feedback
during the entire process to indicate successful progress. Moreover,
completely different sets of commands and parameter values may be required
to perform hot plug operations on differing components, or on similar
components from differing vendors. Both the high possibility of making
mistakes and the steep learning curve make manual performance of hot plug
operations impractical at best.
Industry focus and cooperation on computer system management has prompted
the development of standards for performing routine management operations
on computers. Today's standards generally provide databases containing a
wide variety of management information needed to carry out many computer
system management tasks. While the standard practices used to manage
computers are becoming more uniform and effective as growing numbers of
computer system managers learn, implement and improve these standards,
there has been little if any focus on the area of adding or removing
components to a running, operating computer.
SUMMARY OF THE INVENTION
One embodiment of the present invention is a system for managing an
addition of a component to a computer. The system comprises: (1) a
computer having an expansion slot for connecting to a component; (2) a
component which can be connected to the computer; (3) an extended
management information base, including a plurality of variables, the
variables selected to support adding the component to the computer while
the computer runs, the management information base stored on a computer
readable medium; and (4) a component add software module stored in
executable form on the computer, the component add software module
including at least some of the plurality of variables, the component add
software module directing instructions to the slot to permit adding the
component to the computer while the computer runs.
A further embodiment of the present invention is a system for managing an
exchange of components of a computer. The system comprises (1) a computer
having an expansion slot for connecting to a component; (2) a first
component connected to the expansion slot; (3) an extended management
information base, including a plurality of variables, the variables
selected to support exchanging the first component with another component
while the computer runs, the management information base stored on a
computer readable medium; and (4) a component exchange software module
stored in executable form on the computer, the component exchange software
module including at least some of the plurality of variables, the
component exchange software module directing instructions to the slot to permit
exchanging the first component with another component while the computer
runs.
Another embodiment of the present invention is a system for managing hot
plug data to hot add or hot swap components of a computer. The system
comprises: (1) a computer having an expansion slot; (2) a component
capable of connecting to the expansion slot; (3) hot plug variable data
stored on a computer readable medium, the hot plug variable data
corresponding to hot plug variables selected to support adding the
component to the computer while the computer runs, removing the component
from the expansion slot while the computer runs, and replacing the
component connected to the expansion slot with another component while the
computer runs; and (4) a component add software module stored in
executable form on the computer, the component add software module
including or accessing at least some of the hot plug variable data, the
component add software module directing instructions to the slot to permit
adding the component to the computer while the computer runs.
A still further embodiment of the present invention is a system for
controlling hot plug operations on a server computer. The system
comprises: (1) a server computer having server-based hot plug MIB
variables stored on a computer readable medium; (2) agent software running
on the server computer, the agent software responsive to hot plug-related
requests to perform hot plug operations on the server computer, the hot
plug commands formatted to include a representation of at least one of the
hot plug MIB variables, the requests including a request to shut off power
to a component of the server computer and a request to start power to a
component of the server computer; and (3) a compiled MIB stored on a
computer readable medium, the compiled MIB defining for the agent software
a structure for each of the hot plug MIB variables, the agent software
collecting the hot plug MIB variables by polling components of the
computer for characteristic, capability and state information
corresponding to the respective structures for each MIB variable.
Another embodiment of the present invention is a system for refreshing hot
plug variables. The system comprises: (1) a server computer; (2) a client
computer; (3) a network permitting the client computer and the server
computer to communicate; (4) hot plug variable data stored on a computer
readable medium local to the server computer, the hot plug variable data
including existing component variable data identifying at least one
component of the server computer; (5) agent software running on the server
computer, the agent software polling at least one component of the server
computer for new component variable data identifying the at least one
component of the server computer, the agent software storing the new
component variable data on the computer readable medium to supersede the
existing component variable data; and (6) a retrieve data routine running
on the client computer at a predetermined time interval, the retrieve data
routine generating requests for the hot plug variable data and sending the
requests to the agent software over the network, the agent software
responding by sending over the network the hot plug variable data which
supersedes hot plug variable data stored on a computer readable medium of
the client computer.
A further embodiment of the present invention is a system for broadcasting
hot plug variables. The system comprises: (1) a server computer; (2) a
client computer; (3) a network permitting the client computer and the server
computer to communicate; (4) hot plug variable data stored on a computer
readable medium local to
the server computer, the hot plug variable data including existing
component variable data identifying at least one component of the server
computer; (5) agent software running on the server computer, the agent
software polling at least one component of the server computer for new
component variable data identifying the at least one component of the
server computer, the agent software storing the new component variable
data on the computer readable medium to supersede the existing component
variable data; and (6) a broadcast routine running on the server computer,
the broadcast routine sending the hot plug variable data over the network
to the client computer.
A still further embodiment of the present invention is a system for
managing hot plug operations. The system comprises: (1) a binary hot plug
MIB stored on computer readable media accessible to a server computer, the
binary hot plug MIB generated by compiling with a MIB compiler a hot plug
MIB module, the hot plug MIB module defining hot plug variables selected
to support adding a component to the server computer while the server
computer runs, removing a component from the server computer while the
server computer runs, and exchanging components of the server computer
while the server computer runs, the hot plug variables including a power
state variable for regulating the power state of at least one component of
the server computer and also including a component state variable for
suspending or resuming the operation of at least one component of the
server computer; (2) hot plug software running on the server computer and
responsive to commands related to variables defined in the binary hot plug
MIB; (3) a client computer linked to the server computer by a network, the
binary hot plug MIB stored on computer readable medium of the client
computer; and (4) hot plug management software running on the client
computer, the hot plug management software generating requests to perform
hot plug operations, the requests including a request to shut down power
to a component of the server computer and a request to start power to a
component of the server computer, the requests also including a request to
suspend the operation of a component of the server computer and a request
to resume the operation of a component of the server computer, each of the
requests related to a hot plug variable defined in the binary hot plug
MIB, the hot plug management software sending the requests over the
network to the hot plug software.
Another embodiment of the present invention is a system for controlling,
from a client computer, hot plug operations on a server computer. The
system comprises: (1) a client computer; (2) hot plug management software
running on the client computer; (3) a server computer having an expansion
slot and a component removably connected to the expansion slot; (4) a
device driver running on the server computer and servicing the component;
(5) hot plug agent software running on the server computer; (6) a compiled
MIB defining hot plug variables, the compiled MIB stored on a computer
readable medium of the server computer, and a copy of the compiled MIB
stored on a computer readable medium of the client computer; (7)
server-based hot plug MIB variables stored on a computer readable medium
of the server computer; (8) client-based hot plug MIB variables stored on
a computer readable medium of the client computer; (9) a suspend component
state command generated by the hot plug management software to suspend the
operation of the device driver, the suspend component state command
related to a component state hot plug variable defined in the compiled
MIB, the hot plug management software sending the suspend component state
command to the hot plug agent, the hot plug agent responsively recognizing
the suspend component state command to suspend operation of the device
driver; (10) a power off command generated by the hot plug management
software to stop power to the expansion slot, the power off command
related to a power state hot plug variable defined in the compiled MIB,
the hot plug management software sending the power off command to the hot
plug agent, the hot plug agent responsively recognizing the power off
command to stop power to the expansion slot; (11) a power on command
generated by the hot plug management software to start power to the
expansion slot, the power on command related to a power state hot plug
variable defined in the compiled MIB, the hot plug management software
sending the power on command to the hot plug agent, the hot plug agent
responsively recognizing the power on command to start power to the
expansion slot; and (12) a resume component state command generated by the
hot plug management software to resume the operation of the device driver,
the resume component state command related to a component state hot plug
variable defined in the compiled MIB, the hot plug management software
sending the resume component state command to the hot plug agent, the hot
plug agent responsively recognizing the resume component state command to
resume operation of the device driver.
A further embodiment of the present invention is a system for controlling
hot plug operations. The system comprises: (1) a server computer having an
expansion slot and a component removably connected to the expansion slot;
(2) a device driver running on the server computer and servicing the
component; (3) a database of hot plug commands, the database defining the
structure of power state commands for determining or regulating the power
state of the expansion slot, the database defining the structure of
component state commands for suspending or resuming the operation of the
device driver, and the database defining the structure of component
commands for configuring or changing the operating condition of the
component; and (4) hot plug software running on the server computer, the
hot plug software responsively recognizing the power state commands to
determine or regulate the power state of the expansion slot, the hot plug
software responsively recognizing the component state commands to suspend
or resume the operation of the device driver, and the hot plug software
responsively recognizing the component commands to configure or change the
operating condition of the component.
Still another embodiment of the present invention is a system for updating
hot plug data for a component connected to a computer. The system
comprises: (1) a compiled hot plug MIB defining hot plug variables for a
class of components capable of connecting to a computer; (2) hot plug MIB
variables stored in a computer readable medium of the computer, the hot
plug MIB variables corresponding to hot plug variables defined in the
compiled hot plug MIB; (3) hot plug polling software running on the
computer, the hot plug polling software polling a component of the class
of components after the component is connected to the computer to obtain
hot plug variable data for the component; and (4) hot plug management
software running on the computer, the management software updating the hot
plug MIB variables to contain the obtained hot plug variable data.
These and other embodiments of the present invention will be readily
apparent to those skilled in the art having reference to the detailed
description and drawings which follow, the invention not being limited,
however, to any particular embodiments disclosed.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 illustrates a block diagram of a network having a managing computer
and a managed computer;
FIG. 2 illustrates a block diagram of a computer network having a client
computer and a server computer which communicate to manage the
configuration of the server computer;
FIG. 3 illustrates a diagram representing the nodes of an extended MIB;
FIG. 4 illustrates a block diagram representing the process of compiling
and distributing a MIB;
FIG. 5 illustrates one sequence of steps performed to acquire hot plug MIB
variables for a server computer;
FIG. 6 illustrates one sequence of steps performed to hot plug an adapter
to a server computer;
FIG. 7 illustrates an I/O expansion slot arrangement and bus configuration
of a server computer;
FIG. 8 illustrates one sequence of steps performed to hot plug an adapter
to a server computer having canisters;
FIG. 9 illustrates an I/O expansion slot arrangement and bus configuration
of a server computer having canisters; and
FIG. 10 illustrates one sequence of steps performed to hot swap an adapter
to a server computer having canisters.
DETAILED DESCRIPTION OF THE INVENTION
Administrators of centralized server computers need a way to remove failed
server components and add functioning components without shutting the
servers down. Further, computer system administrators need databases of
component information and related management tools to insulate them from
the complexities of low level software and to make component add and
remove operations uniform, practical and risk free.
The present invention concerns software tools to manage server computers.
In particular, the present invention is directed to selecting, acquiring,
and maintaining groups of data values for adding or removing components of
a computer while that computer is running. The groups of data values
permit system managers using system management software tools to identify
components of a server computer as well as the capabilities of those
components, and also to control the operation of the components and their
interface to the server computer.
Generally, the groups of data values correspond to characteristics,
capabilities or states of components of a managed server computer which
may be added to or removed from a running computer. These components
include, but are not limited to, processors, memory, expansion cards,
expansion slots, adapter cards, cooling fans, keyboards, monitors,
pointing devices, power supplies, I/O ports, floppy disk drives, hard disk
drives, CD-ROM drives, device drivers, and data files. The
characteristics, capabilities or states of certain components may need to
be tested, examined or altered by system management software or agent
software during a "hot plug" operation. Hot plug operations include hot
add operations (i.e., a new component is added to the managed computer
without shutting it down, making it unavailable to a user, or halting its
execution of instructions) and hot swap operations (i.e., a component is
removed and optionally replaced by another component without shutting down
the managed computer, making it unavailable to users, or halting its
execution of instructions).
The present invention permits a network administrator to manage and control
the process of adding and removing computer components while a computer is
running. The present invention supports hot plug operations by defining,
organizing, and maintaining hot plug variables which identify components
of a computer that may be involved in hot plug operations, and which also
identify capabilities and operational states of those components as well
as control their operation and interface to the computer. Grouping the hot
plug variables according to particular components provides efficient
access by software modules to convenient subsets of hot plug variables
which may be needed to manipulate the operation of a component during a
particular hot plug operation. Because the hot plug variables both
identify a component as well as represent states and capabilities of a
component, the hot plug variables advantageously operate as commands to
predetermined components.
In one embodiment, groups of hot plug variables representing components of
one computer may be stored in a computer readable medium on a remote
computer. By forming requests and commands from the groups of hot plug
variables, and by sending these over a network (or some other
communication means) to a responsive process running on the first
computer, the remote computer advantageously controls hot plug operations
on the first computer.
A remote computer, in one embodiment, advantageously retrieves current hot
plug variables from another computer by executing, at predetermined time
intervals, a retrieve data routine designed to request, retrieve and store
hot plug variables. The remote computer thereby maintains current
information regarding the status and capabilities of components in a
managed computer. In an alternative embodiment, software on a managed
computer polls the components of that computer to collect hot plug
variable data, stores the hot plug variable data in computer readable
media, and, either at predetermined time intervals, or upon sensed
configuration changes, causes the hot plug variables to be broadcast to
one or more remote computers.
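The two refresh strategies described above (retrieval by the remote computer at predetermined intervals, and broadcast by the managed computer) may be sketched as follows. This is only an illustrative sketch in Python and not the implementation of any embodiment: the class `ManagedComputer`, the functions `retrieve_data` and `broadcast`, and the variable name `slotPowerState` are hypothetical, and the network is replaced by direct function calls.

```python
from typing import Dict, List

HotPlugData = Dict[str, str]


class ManagedComputer:
    """Hypothetical stand-in for a managed computer and its agent software."""

    def __init__(self, components: Dict[str, str]) -> None:
        self._components = components   # live state of the components
        self.stored: HotPlugData = {}   # hot plug variable data kept locally

    def poll(self) -> HotPlugData:
        """Poll the components; the new data supersedes the stored data."""
        self.stored = dict(self._components)
        return dict(self.stored)

    def change(self, name: str, value: str) -> None:
        """Simulate a configuration change on the managed computer."""
        self._components[name] = value


def retrieve_data(server: ManagedComputer, cache: HotPlugData) -> None:
    """Pull model: the remote computer requests current data and replaces
    its own copy. A scheduler would call this at a predetermined interval."""
    fresh = server.poll()
    cache.clear()
    cache.update(fresh)


def broadcast(server: ManagedComputer, caches: List[HotPlugData]) -> None:
    """Push model: the managed computer polls itself and sends the result
    to each remote computer, for example after a sensed change."""
    fresh = server.poll()
    for cache in caches:
        cache.clear()
        cache.update(fresh)
```

In the pull model the remote copy can be stale between intervals; in the push model the managed computer decides when remote copies are superseded.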
Advantageously, the present invention is consistent with and may extend
standard processes widely used to perform computer management tasks. In
one such embodiment, the present invention advantageously leverages
existing software management tools by defining and organizing hot plug
variables according to a hierarchical data storage model called a
management information base (MIB). Using the widely implemented MIB model
to define and organize hot plug variables allows the present invention to
support hot plug operations independent of any particular operating system
or computer environment and also facilitates performing hot plug
operations remotely, over a network.
Computer management software tools complying with SNMP (Simple Network
Management Protocol) standards use MIB-based variables to cooperate in
managing computers. These software tools routinely operate over a network,
with management software running on a client computer issuing
management-related requests to responsive agent software running on a
server, and with MIB variables forming the basis of each such request.
Thus, extending a MIB to define and organize hot plug MIB variables
expands the scope of operations which SNMP management software can control
and with which SNMP agent software can cooperate.
In the following description of the invention, a module includes, but is
not limited to, software or hardware components which perform certain
tasks. Thus, a module may include object-oriented software components,
class components, procedures, subroutines, data structures, segments of
program code, drivers, firmware, microcode, circuitry, data, tables, arrays,
etc. In addition, those with ordinary skill in
the art will recognize that a module can be implemented using a wide
variety of different software and hardware techniques.
FIG. 1 illustrates a network 100 that enables a managing computer 102 to
exchange data with a managed computer 104. The managed computer 104
includes a removable component 105 and an expansion slot 107 which can
accept an additional component.
A managing application 106 software module runs on the managing computer
102. The managing application 106 maintains grouped hot plug data 108
corresponding to characteristics, capabilities or states of components of
the managed computer 104. Periodically, the managing application 106 sends
a request over the network 100, formatted according to communication
protocol 110, to the managed computer 104 seeking current hot plug data.
A management agent 112 software module running on the managed computer 104
receives requests for hot plug data from the network 100. One or more hot
plug data acquisition routines in the management agent 112 poll or query
various components of the managed computer 104 and interact with device
drivers installed on the managed computer 104 to acquire grouped hot plug
data 114. The management agent 112 sends the requested current hot plug
data values, formatted in accordance with the communication protocol 110,
to the managing computer 102. The managing application 106 receives
current hot plug data values from the managed computer 104 over the
network 100.
A network administrator uses the managing application 106 to control and
carry out hot plug (hot add or hot swap) operations performed on the
managed computer 104. In preparing to perform a hot plug operation, the
network administrator invokes a component browse routine of the managing
application 106. The component browse routine extracts data from the
grouped hot plug data 108 to display on a computer screen of the managing
computer 102 a representation of the removable component 105 and the
expansion slot 107 of the managed computer 104.
To perform a hot add operation, the network administrator selects (using a
mouse pointer, for example) the displayed representation of the expansion
slot 107 of the managed computer 104 and invokes a hot add routine of the
managing application 106. The hot add routine accesses the grouped hot
plug data 108 to obtain information about the expansion slot 107 such as,
for example, its power state (whether power is currently being supplied to
the expansion slot 107) and the location of the expansion slot 107 (e.g.,
one or more identifiers permitting software on the managed computer 104 to
unambiguously direct commands to the expansion slot 107).
The hot add routine of the managing application 106 generates and sends, if
needed, a request to the managed computer 104 to shut down power to the
expansion slot 107. The management agent 112 receives the request over the
network 100, shuts down power to the expansion slot 107, verifies the
successful completion of the operation and sends response information over
the network 100 to the managing computer 102 indicating the successful
completion of the requested operation.
The managing application 106 receives the successful completion information
and prompts the network administrator to add a component to the managed
computer 104. The network administrator then physically inserts a
component into the expansion slot 107 and provides input to the managing
application 106 indicating that a component has been added to the managed
computer 104.
The hot add routine of the managing application 106 uses the grouped hot
plug data 108 concerning the expansion slot 107 to generate and send a
request to the managed computer 104 to restore power to the expansion slot
107. The management agent 112 receives the request, performs it and sends
information to the managing computer 102 indicating that the request
completed successfully.
The network administrator may verify operation of the component added to
the managed computer 104 by generating and sending requests to the
managed computer 104 to query, poll or examine the added component and
obtain hot plug data concerning the added component. The management agent
112 receives such requests and polls or queries the added component to
obtain information such as, for example, a component name, driver number,
vendor ID, device ID, operational state, and whether the component can be
hot swapped (removed while the computer is still running). The management
agent 112 stores the acquired data concerning the added component in the
grouped hot plug data on the managed computer 104 and also sends the
acquired data to the managing computer 102 over the network 100.
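The hot add sequence described above may be sketched as follows. The sketch is illustrative only and makes several assumptions: the names `Slot`, `Agent`, `set_power`, `query` and `hot_add` are hypothetical, the network exchange is replaced by direct method calls, and the physical insertion by the network administrator is simulated by an assignment.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Slot:
    powered: bool = True
    component: Optional[str] = None


@dataclass
class Agent:
    """Hypothetical management agent running on the managed computer."""
    slots: Dict[int, Slot] = field(default_factory=dict)

    def set_power(self, slot_id: int, on: bool) -> bool:
        slot = self.slots[slot_id]
        slot.powered = on
        return slot.powered == on  # verify before reporting success

    def query(self, slot_id: int) -> Dict[str, object]:
        slot = self.slots[slot_id]
        return {"component": slot.component, "powered": slot.powered}


def hot_add(agent: Agent, slot_id: int, new_component: str) -> List[str]:
    """Run the hot add steps in order and return the steps performed."""
    steps: List[str] = []
    # 1. Shut down power to the slot, but only if it is currently powered.
    if agent.query(slot_id)["powered"]:
        if not agent.set_power(slot_id, False):
            raise RuntimeError("power off request failed")
        steps.append("power off")
    # 2. The administrator inserts the component (simulated here).
    agent.slots[slot_id].component = new_component
    steps.append("insert")
    # 3. Restore power to the slot.
    if not agent.set_power(slot_id, True):
        raise RuntimeError("power on request failed")
    steps.append("power on")
    # 4. Query the slot to verify that the added component is visible.
    if agent.query(slot_id)["component"] != new_component:
        raise RuntimeError("added component not detected")
    steps.append("verify")
    return steps
```

Each step proceeds only after the previous request reports success, mirroring the request and completion-response exchange between the managing application and the management agent.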
To perform a hot swap operation, the network administrator determines that
the removable component 105 of the managed computer 104 should be replaced
with another component. The network administrator's determination may be
based on an alert condition that the removable component 105 is
malfunctioning. The alert condition may be monitored by the managed
computer 104 and communicated by the managed computer 104 to the managing
computer 102. To remove and replace the removable component 105, the
network administrator invokes a hot swap routine.
The hot swap routine accesses the grouped hot plug data 108 to obtain
information about the removable component 105 such as, for example, which
I/O expansion slot it may occupy, the power state of that expansion slot,
and any device driver (or other software module interfacing with the
removable component 105). The hot swap routine of the managing application
106 generates and sends a request to the managed computer 104 to suspend
operation of the device driver servicing the removable component 105.
The management agent 112 receives the suspend request over the network 100
and issues a call to suspend the target device driver. The management
agent 112 sends a response to the managing computer 102 that the device
driver has been successfully suspended. The managing application 106
receives the successful suspension response and, as in the hot add
operation, generates and sends a request to the managed computer 104 to
shut off power to the I/O expansion slot holding the removable component
105. The management agent 112 receives the request to shut off power,
performs the request, and sends a successful completion response to the
managing computer 102.
The managing application 106 prompts the network administrator to remove
the removable component 105 and replace it. The network administrator
removes the removable component 105 and replaces it with another component.
If the new component requires a device driver different than the one that
serviced the removable component 105, the network administrator loads
(and, if needed, configures) the required device driver. The network
administrator provides input to the managing application 106 indicating
that the removable component 105 has been removed and replaced with
another component.
The managing application 106 generates and sends a request to the managed
computer 104 to restore power to the I/O expansion slot holding the
replacement component. The management agent 112 receives the request to
restore power, restores the power to the target I/O expansion slot, and
sends a successful completion response to the managing computer 102. The
managing application 106 generates and sends a request to the managed
computer 104 to resume the operation of the suspended driver. The
management agent 112 receives the request to resume driver operation,
issues a call to the driver to resume its operation (begin servicing the
replacement component), and sends to the managing computer 102 a
successful completion response.
The managing application 106 requests updated hot plug data from the
managed computer 104. The management agent 112 receives requests for
updated hot plug data, polls and queries components of the managed
computer 104 as needed, and returns (over the network 100) the updated hot
plug data to the managing computer 102. The managing application 106
examines the updated hot plug data 108 to verify the normal operation of
the replacement component.
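The ordering of the hot swap steps described above matters: the driver is suspended before power is removed, and power is restored before the driver is resumed. One way to enforce that ordering is a small state machine, sketched below. The names `Step`, `HOT_SWAP_ORDER` and `HotSwapSession` are hypothetical and are used only for illustration; the sketch tracks the sequence and does not itself touch hardware or drivers.

```python
from enum import Enum, auto


class Step(Enum):
    SUSPEND_DRIVER = auto()
    POWER_OFF = auto()
    REPLACE_COMPONENT = auto()
    POWER_ON = auto()
    RESUME_DRIVER = auto()
    VERIFY = auto()


# Iterating an Enum yields its members in definition order.
HOT_SWAP_ORDER = list(Step)


class HotSwapSession:
    """Accepts hot swap steps only in the prescribed order."""

    def __init__(self) -> None:
        self._next = 0

    @property
    def done(self) -> bool:
        return self._next == len(HOT_SWAP_ORDER)

    def perform(self, step: Step) -> None:
        if self.done:
            raise RuntimeError("hot swap already complete")
        expected = HOT_SWAP_ORDER[self._next]
        if step is not expected:
            raise RuntimeError(
                f"expected {expected.name}, got {step.name}"
            )
        self._next += 1
```

A managing application built this way would refuse, for example, a power off request issued while the device driver is still running.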
In accordance with one embodiment of the present invention, the groups of
hot plug data values exist as groups of variables, where each variable has
a name or identifier, associated attributes (which may include a data
type, a description and status information) and associated operations
(e.g., read or write) that can be performed on the variable. Such
variables may also be known as managed objects. In this embodiment, a
network administrator uses management software to access groups of
variables to control hot plug operations performed on managed server
computers.
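A variable of the kind just described (a name, associated attributes, and permitted operations) may be modeled as in the following sketch. The class `ManagedVariable`, the helper `snapshot`, and the example variable names are hypothetical; they are not taken from any MIB defined in this document.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class ManagedVariable:
    name: str          # identifier of the variable
    data_type: type    # attribute: expected type of the value
    description: str   # attribute: human-readable description
    access: str        # permitted operations: "read-only" or "read-write"
    value: Any = None

    def read(self) -> Any:
        return self.value

    def write(self, new_value: Any) -> None:
        # Reject writes to variables that only permit reading.
        if self.access != "read-write":
            raise PermissionError(f"{self.name} is read-only")
        # Reject values that do not match the declared data type.
        if not isinstance(new_value, self.data_type):
            raise TypeError(f"{self.name} expects {self.data_type.__name__}")
        self.value = new_value


def snapshot(group: List[ManagedVariable]) -> Dict[str, Any]:
    """Return the current values of a group of variables, keyed by name."""
    return {v.name: v.read() for v in group}
```

Under this model, writing to a read-write variable such as a power state acts as a command, while a read-only variable such as a vendor identifier only reports a characteristic of the component.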
FIG. 2 illustrates a computer network 200 having a client computer 202 and
a server computer 204. In one embodiment, the client computer 202 and the
server computer 204 run Windows NT™ operating system software and use
one or more Intel Pentium™ processors to process instructions. It will
be understood by those of ordinary skill in the art that the invention is
not limited by an operating system or a type of processor. More
particularly, the server computer 204 may be a NetFRAME ClusterServer 9000
(NF9000) series server computer provided by NetFRAME, Inc. of Milpitas,
Calif.
In one embodiment, the client computer 202 and server computer 204 use a
communication protocol 206 defined according to SNMP (Simple Network
Management Protocol) to communicate with each other over a Windows
NT-based network 200. The invention, however, is not limited by any
particular network or network protocol. SNMP is well known in the art as a
collection of standards and protocols for managing network-based devices.
SNMP provides guidelines for requesting, obtaining and transporting
management information between network components. (See, for example,
Marshall T. Rose, The Simple Book (2d ed. 1994)). SNMP became an Internet
Standard in 1990, when it was published as RFC1157 (Request For Comments
No. 1157). Since then, the SNMP standards and protocols have been updated
to support graphical user interfaces, and, so updated, the standards and
protocols are referred to as WinSNMP. One embodiment of the present
invention uses a WinSNMP product (including protocol stack and libraries)
from ACE*COMM of Gaithersburg, Md., which implements WinSNMP in
cooperation with Windows NT.
Management software called Maestro 208 runs on the client computer 202.
Maestro 208 controls hot plug operations performed on the server computer
204. An SNMP agent 209 runs on the server computer 204. SNMP agents are
known in the art. SNMP agents respond to SNMP requests (e.g., SNMP_GET,
SNMP_GETNEXT, and SNMP_SET) received from management
software, such as Maestro 208.
An SNMP extension agent 210 also runs on the server computer 204. The SNMP
extension agent 210 also responds to SNMP requests (e.g., SNMP_GET,
SNMP_GETNEXT, and SNMP_SET) received from Maestro 208, but
only responds to requests not processed by the SNMP agent 209. Those of
ordinary skill in the art will understand generally that SNMP agents may
not process certain SNMP requests and, rather, may pass such requests to
an SNMP extension agent.
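The division of labor between an SNMP agent and an SNMP extension agent may be sketched as follows. This is a simplified illustration, not the behavior of any particular SNMP product: the classes `ExtensionAgent` and `SnmpAgent` are hypothetical, only a GET-style lookup is modeled, and the enterprise number 99999 in the example object identifier is a placeholder rather than an assigned value.

```python
from typing import Dict, Optional


class ExtensionAgent:
    """Hypothetical extension agent holding hot plug variables."""

    def __init__(self, variables: Dict[str, str]) -> None:
        self.variables = variables

    def get(self, oid: str) -> Optional[str]:
        return self.variables.get(oid)


class SnmpAgent:
    """Hypothetical agent that answers what it can and passes on the rest."""

    def __init__(self, variables: Dict[str, str],
                 extension: ExtensionAgent) -> None:
        self.variables = variables
        self.extension = extension

    def get(self, oid: str) -> Optional[str]:
        # Requests the agent can process are answered directly.
        if oid in self.variables:
            return self.variables[oid]
        # Requests it does not process are passed to the extension agent.
        return self.extension.get(oid)


# sysName.0 in the standard MIB-II system group.
SYS_NAME = "1.3.6.1.2.1.1.5.0"
# Placeholder OID under the enterprises subtree (99999 is not assigned here).
SLOT_POWER = "1.3.6.1.4.1.99999.1.1.0"

extension = ExtensionAgent({SLOT_POWER: "on"})
agent = SnmpAgent({SYS_NAME: "server-204"}, extension)
```

With this arrangement, management software addresses a single agent, and hot plug variables become reachable without changing the agent itself.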
In response to requests generated and sent by Maestro 208 over the network
200, the SNMP extension agent 210 assists in the performance of hot plug
operations by acquiring hot plug variable information concerning
components of the server computer 204, by sending the acquired hot plug
variable information to Maestro 208, and by directing commands to any
components of the server computer 204 involved in a hot plug operation as
requested by Maestro 208.
The SNMP extension agent 210 communicates with a device driver called
configuration manager 218 to acquire information about adapters or device
drivers in the server computer 204 or to direct commands or requests to
adapter-related components (e.g., device drivers servicing the adapters
such as SCSI drivers or ethernet drivers). The SNMP extension agent 210
also communicates with a device driver called Intrapulse manager 220 to
acquire information about I/O expansion slots, canisters (removable
devices having multiple I/O expansion slots), power supplies and system
boards of the server computer 204 or to direct commands or requests to
related components such as associated drivers.
In another embodiment, both Maestro 208 and the SNMP extension agent 210
run on the server computer 204. In such an embodiment, a network
administrator may invoke Maestro 208 as well as provide input to Maestro
208 from a client computer 202.
Maestro 208 maintains, in a computer readable medium such as random access
memory (RAM), hot plug MIB (management information base) variable data 212
relating to characteristics, capabilities or states of components of the
server computer 204 which may be involved in a hot plug operation. A MIB
generally represents network, device, component and other information in a
tree-like framework, specifying hierarchical relationships between the
networks, devices, components and other elements.
A hot plug MIB defines a set of hot plug MIB variables and groups those hot
plug variables according to the components that are managed during hot
plug operations. MIBs are known in the art and are commonly used to define
variables for use in managing network-based components or devices using
SNMP standards and protocols. (See, for example, Dr. Sidnie Feit, SNMP: A
Guide to Network Management (1995); Mark A. Miller, Managing Internetworks
with SNMP (1993)).
The SNMP extension agent also maintains, in a computer readable medium such
as RAM, hot plug MIB variable data 214 relating to characteristics,
capabilities or states of components of the server computer 204. While the
set of hot plug MIB variables 212 maintained by Maestro may be the same as
the set of hot plug MIB variables 214 maintained by the SNMP extension
agent 210, the values of the variables may differ as those maintained on
the server computer 204 may have been obtained more recently. In one
embodiment, the hot plug MIB variable data 212 on the client computer 202
and the hot plug MIB variable data 214 on the server computer 204 are
organized as tables in respective RAM.
FIG. 3 represents portions of the hierarchical levels of a MIB. The nodes
comprising upper levels of a MIB hierarchy 302 are fixed by the SNMP
standard, including nodes corresponding to standards setting organizations
down through industry-specific organizations down through particular
networks and further down through private networks and to individual
enterprises. Lower levels, and in particular levels below the "enterprise"
node 304, are not defined by the SNMP standard and may be defined as needed. A
portion of a MIB including the levels below the "enterprise" element 304
may be referred to as a subMIB 306.
According to the SNMP standard, the nodes comprising each hierarchical
level of a MIB are assigned unique integer values in addition to names.
Thus, a target node occupying a lower level in a MIB may be identified by
listing in series separated by a "." the number assigned to each node
encountered in a downward traversal of the MIB beginning with the highest
(or root level) node. Thus, the "enterprise" node 304 may be identified as
1.3.6.1.4.1 (iso (1), org (3), dod (6), internet (1), private (4),
enterprise (1)).
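As a brief illustration of this dotted notation, the following Python sketch joins the integers met on a downward traversal from the root. It uses the standard assignments iso (1), org (3), dod (6), internet (1), private (4), enterprise (1); the helper name to_oid is an assumption introduced for the example.

```python
# Node names and assigned integers on the path from the root to "enterprise".
PATH = [("iso", 1), ("org", 3), ("dod", 6),
        ("internet", 1), ("private", 4), ("enterprise", 1)]


def to_oid(path):
    """Join the integer of each node, root first, with '.' separators."""
    return ".".join(str(number) for _name, number in path)


enterprise_oid = to_oid(PATH)
print(enterprise_oid)
```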
The subMIB 306 extends the SNMP standard MIB, the extension defining groups
of variables that support management operations performed on a server
computer, in this case an NF9000 ClusterServer series server computer by
NetFRAME, Inc. On the level immediately below the "enterprise" node 304, a
"netframe" node 308 represents the highest level of the subMIB 306
extending the SNMP standard MIB. Nodes immediately below the "enterprise"
node 304 may be requested from and registered by the Internet Assigned
Numbers Authority (IANA) which assigns an integer value to a requested
node and registers the name and assigned value.
In FIG. 3, the subMIB 306 defines a "nf9000Mgmt" node 310 at the level
immediately below the "netframe" node 308. On the level immediately
following the "nf9000Mgmt" node 310, the subMIB 306 defines group nodes
corresponding to management related components of the server computer 204:
a trapGroup node 312, an adapterGroup node 314, a driverGroup node 316, a
slotGroup node 318, a canisterGroup node 320, a systemGroup node 321, a
coolingGroup node 322, a powerGroup node 324, a cpuGroup node 326, and a
slotFanGroup node 328. Each group node includes MIB variables
corresponding to a component of the server computer 204.
MIBs are formally described using an abstract syntax notation set out in
ISO 8824. A MIB defining the nodes and group nodes of the subMIB 306 and
also defining the MIB variables for each group node follows:
##SPC1##
MIBs thus specify variables hierarchically in groups, each group generally
corresponding to a managed, network-based component that may be subject to
software control or interrogation. Descriptions of exemplary groups of MIB
variables follow:
The Trap Group
The trap group includes MIB variables corresponding to alerts to the
network administrator. MIB variables of the trap group facilitate alerts
monitored by one or more software modules running on a server computer
204. Monitored alerts include, for example, CPU failure, fan failure,
temperature warning, power supply failure, insertion/extraction of power
supplies and canisters, adapter malfunctions, and I/O slot power changes.
The Adapter Group
The adapter group includes MIB variables representing adapters in a server
computer 204. Each I/O card may have one or more adapters or multifunction
cards. The adapters represented include those that support hot swap. The
adapter group includes variables to control the state of an adapter to
permit hot swap or hot add operations. Variables corresponding to
configuration information for identifying each adapter are also included
as are variables providing links to the slot and driver groups.
The Driver Group
The driver group includes MIB variables representing the device drivers
associated with adapters. The MIB variables in the driver group represent
the name and version of the device drivers. This group may also include
variables identifying the location of loadable device driver files,
variables identifying instructions for loading device drivers, and
variables identifying instructions for configuring device drivers.
The Slot Group
The slot group includes MIB variables representing I/O expansion slots in a
server computer 204. The MIB variables in the slot group represent the
capability of various expansion slots to be powered off or on either
individually or on a per canister basis (a canister is a device containing
multiple expansion slots). Information on the availability (whether in a
canister or currently occupied by a card) and power state of expansion
slots is also represented by the MIB variables of this group.
The Canister Group
The canister group includes MIB variables representing canisters and their
components. Each canister has multiple expansion slots. The canister group
MIB variables represent information on each canister, including the
respective name, serial number, revision number, type, power state,
location, and fans.
The System Group
The system group includes MIB variables representing information related to
the server computer 204. For example, serial numbers, description, and
revision data for hardware components are represented by MIB variables in
this group, as well as information about memory, system time and date.
The Cooling Group
The cooling group includes MIB variables representing the devices used to
cool the components of the server computer 204. The MIB variables in this group
include information about system board fans and temperature sensors.
The Power Group
The MIB variables in the power group represent information about power
supplies (e.g., serial number, DC state, AC state, and location), physical
power switches, the reset button, and voltage sensors on the system board.
The CPU Group
The CPU group includes MIB variables representing information about system
CPUs such as clock frequency, bus-to-core ratio, and CPU state.
The Slot Fan Group
The MIB variables in the slot fan group represent information about I/O
expansion slot cooling fans.
Software modules such as Maestro 208 and the SNMP extension agent 210
incorporate groups of MIB variables for use in managing network-based
components. That incorporation process generally begins with designing a
MIB and then compiling it to produce a binary MIB file.
FIG. 4 represents a block diagram illustrating the process of creating and
distributing a binary MIB file. After identifying a set of variables
corresponding to characteristics, capabilities or states of network-based
components which might need to be accessed, acquired or altered to perform
desired management operations, a MIB designer generally prepares a MIB
definition file 402 containing a formal, syntactically correct MIB
specification (sometimes referred to as a MIB module). The MIB
specification defines various attributes for each variable according to
the ISO 8824 standard which provides generally that each variable have an
OBJECT-TYPE identifier (or variable name), a SYNTAX specification (or data
type such as integer, string, etc.), and an ACCESS descriptor (e.g.,
whether the variable may be written or is read-only).
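The three attributes named above can be pictured as a small record per variable. The Python sketch below is illustrative only: the class name MibVariable and the particular SYNTAX and ACCESS values shown for slotPowerState and driverName are assumptions, since the MIB listing itself is not reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MibVariable:
    """One MIB variable definition with its three basic attributes."""
    name: str    # OBJECT-TYPE identifier (variable name)
    syntax: str  # SYNTAX specification (data type)
    access: str  # ACCESS descriptor

    def is_writable(self):
        """True if a SET request could be accepted for this variable."""
        return self.access in ("read-write", "write-only")


# Assumed example definitions, not taken from the actual MIB listing.
slot_power_state = MibVariable("slotPowerState", "INTEGER", "read-write")
driver_name = MibVariable("driverName", "DisplayString", "read-only")
```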
The MIB designer inputs the MIB definition file (typically an ASCII text
file) into a MIB compiler 404. MIB compilers are known in the art.
Products such as IBM's NetView/6000™ or Hewlett-Packard's OpenView
Network Node Manager™ provide MIB compilers. The MIB compiler 404 reads
and processes the ASCII MIB definition file 402 and produces a binary MIB
file 406 as output. In one embodiment, the binary MIB file is called a
"MIB.INI" file. A network administrator copies the binary MIB file 406 to
local computer storage media such as a hard disk drive of each client
computer 202 and server computer 204 attached to the network 200.
After one or more groups of MIB variables are defined (at least given names
and data type specifications), a programmer may design a software module
like Maestro 208 which can use the defined MIB variables to manage
network-based components by forming SNMP requests including the MIB
variables and sending those requests to managed components.
A programmer may also design a software module to be responsive to SNMP
requests that are based on certain defined MIB variables. Generally, SNMP
agent modules respond to SNMP requests using MIB variables fixed by
standard. Software modules responsive to SNMP requests based on newly
defined MIB variables are sometimes called SNMP extension agents which
cooperate with an SNMP agent, the SNMP extension agent responding only to
SNMP requests not processed by the SNMP agent.
Management software modules, like Maestro 208, generate and send SNMP
requests, and agent software modules, like the SNMP extension agent 210,
receive and respond to SNMP requests. The format of an SNMP request
includes a request (or message) type (i.e., GET_REQUEST,
GET_NEXT_REQUEST, SET_REQUEST) and a variable/value pair.
The variable in the variable/value pair corresponds to a MIB variable
(which may be called an object identifier). The value in the
variable/value pair represents storage space for a data value which may be
ignored or may contain a value represented by the variable. In the case of
a request of type GET_REQUEST or GET_NEXT_REQUEST, a
management software module generally seeks information from a managed
computer (or component or device), the specific information identified by
the variable, and the value may be ignored when such a request is
transmitted. The managed computer will ascertain the information sought,
store the discovered information in the storage space for the value and
send the response to the management software module.
In one embodiment, an example of a variable/value pair for an SNMP request
of type GET_REQUEST is (1.3.6.1.4.1.837.2.4.1.1.6.2.1, 0). In this
example, the variable represents the following node pathway: iso, org,
dod, internet, private, enterprise, netframe, nf9000mgmt, slotGroup,
slotTable, slotTableEntry, slotPowerState, slotGroupNumber2, slotNumber1.
The value (the second member of the variable/value pair) is 0 on
transmission, but the management software may expect the value to be
supplied by an SNMP agent module running on the managed computer which
sends to the management software a response message having a format
similar to that of the request. In one embodiment, a management software
module, like Maestro 208, may send the example SNMP request to a server
computer to request the power state (whether power is supplied or not) of
the first I/O expansion slot in the second group (or canister) of I/O
expansion slots. In a corresponding response message, the same variable
may be sent, along with a value=1 (e.g. power to the slot is ON) by an
SNMP agent module to the management software module.
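The example variable can be taken apart programmatically. The Python sketch below pairs each integer of the example object identifier with the node name given in the pathway above and treats the last two integers as the slot group number and slot number. The function name describe is an assumption introduced for illustration.

```python
# Node names for the fixed part of the example pathway, root first.
NODE_NAMES = ["iso", "org", "dod", "internet", "private", "enterprise",
              "netframe", "nf9000mgmt", "slotGroup", "slotTable",
              "slotTableEntry", "slotPowerState"]


def describe(oid):
    """Split an OID into (name, number) pairs plus the two table indices."""
    numbers = [int(part) for part in oid.split(".")]
    named = list(zip(NODE_NAMES, numbers))
    # The integers after slotPowerState select the table row.
    group_number, slot_number = numbers[len(NODE_NAMES):]
    return named, group_number, slot_number


named, group_number, slot_number = describe("1.3.6.1.4.1.837.2.4.1.1.6.2.1")
```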
In the case of a request of type SET_REQUEST, the value represents
an actual value to be assigned to some component or device indicated by
the variable. In one embodiment, an example of a variable/value pair for
an SNMP request of type SET_REQUEST is
(1.3.6.1.4.1.837.2.4.1.1.6.2.1, 1). A management software module like
Maestro 208 might send this example SNMP request to cause power to be
supplied to the first expansion slot of the second expansion slot group.
The value 1 corresponds to power state ON, and the request type,
SET_REQUEST, indicates to receiving agent software, such as the
SNMP extension agent 210, that the component identified by the variable is
to be set according to the value (e.g., 1). Accordingly, the agent
software takes steps to power the identified slot.
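The GET and SET exchanges described above can be mimicked with a toy agent that keeps variable values in memory. This Python sketch is only a model of the variable/value semantics; it performs no network communication and controls no hardware, and the class name ToyAgent and its handle method are assumptions. The power state values 1 (on) and 2 (off) follow the slotPowerState description in the Slot Information section.

```python
SLOT_POWER_STATE = "1.3.6.1.4.1.837.2.4.1.1.6.2.1"  # example variable from the text
POWER_ON, POWER_OFF = 1, 2


class ToyAgent:
    """In-memory stand-in for agent software responding to requests."""

    def __init__(self):
        # Assume the slot starts powered off.
        self.values = {SLOT_POWER_STATE: POWER_OFF}

    def handle(self, request_type, variable, value=0):
        """Return the (variable, value) pair of the response message."""
        if request_type == "SET_REQUEST":
            self.values[variable] = value  # assign the supplied value
        elif request_type != "GET_REQUEST":
            raise ValueError("unsupported request type: " + request_type)
        # For GET_REQUEST the incoming value is ignored and filled in here.
        return (variable, self.values[variable])


agent = ToyAgent()
before = agent.handle("GET_REQUEST", SLOT_POWER_STATE)
after = agent.handle("SET_REQUEST", SLOT_POWER_STATE, POWER_ON)
```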
Generally, a programmer designs software functions or routines which
request MIB variable information, or which use or acquire MIB variable
information in communicating directly with hardware components, with
system services, or with drivers. Those of ordinary skill in the art will
appreciate that, in one embodiment, the MIB variables represented in a
binary MIB file are exposed to programmer-designed modules or routines and
that the routines are registered with an SNMP service.
Software modules designed and developed to use MIB variables can be written
in a wide variety of programming languages, such as C, C++, or Pascal, and
such software modules may or may not be arranged in an object-oriented
format. The present invention is not limited by a programming language.
Some MIB compilers assist the development of software modules using MIB
variables by automatically generating skeletal outlines of software code,
including data declarations corresponding to MIB variables, that will be
needed to set or obtain values in a managed device. Those of ordinary
skill will appreciate that, in one embodiment, an SNMP extension agent may
be designed to respond to requests concerning particular MIB variables by
generating, in C programming language terms, a large switch block which
simply associates groups of programming language instructions with case
statements corresponding to MIB variables. In this embodiment, program
execution may efficiently branch based simply on MIB variables.
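The branching idea can be sketched in Python with a dictionary lookup standing in for the C switch block; this is a substitution made for illustration, not the C code of any embodiment. The handler functions and the values they return are placeholders.

```python
def get_slot_power_state():
    """Placeholder handler; a real agent would query the hardware."""
    return 1  # 1 = power to slot is on


def get_adapter_state():
    """Placeholder handler; a real agent would query the driver."""
    return 4  # 4 = active


# Each key plays the role of a "case" label in the C switch block.
HANDLERS = {
    "slotPowerState": get_slot_power_state,
    "adapterState": get_adapter_state,
}


def respond(variable_name):
    """Branch on the MIB variable name; None plays the 'default' case."""
    handler = HANDLERS.get(variable_name)
    if handler is None:
        return None
    return handler()
```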
In the embodiment illustrated in FIG. 2, a MIB extended to define groups of
variables for hot plug operations (a hot plug MIB) is compiled to generate
a binary hot plug MIB file. The hot plug MIB includes hot plug group
nodes: trapGroup node 312, adapterGroup node 314, driverGroup node 316,
slotGroup node 318, and canisterGroup node 320. The binary hot plug MIB
file exposes to Maestro 208 and to the SNMP extension agent 210 the
variables within each hot plug group. The variables defined in the hot
plug groups are described below.
Trap Information
While some hot plug operations are routine and preventive in nature, other
hot plug operations are performed in response to failures or pending
failures of components of the server computer 204. If and when alerted
that a failure has occurred or is pending, a network administrator can
quickly take action to learn the precise cause of the failure and then
remedy the failure. The hot plug MIB variables defined in the trapGroup
312 provide information useful in identifying faulty components of a
server computer 204.
The following diagram represents the hot plug MIB variables defined in the
trapGroup 312.
______________________________________
trapGroup
______________________________________
1 trapCpu
2 trapSystemBoardFan
3 trapTemperature
4 trapPowerSupply
5 trapCanister
6 trapAdapter
7 trapSlotFan
8 trapCanisterFan
______________________________________
The trapCpu variable reports the number of a failed CPU. This number can be
used to index another variable (e.g., a cpuTable) to retrieve more
information on the failed CPU.
The trapSystemBoardFan variable reports the number of a failed system board
fan, indicating that a fan speed dropped below the minimum limit allowed.
The fan number represented by this variable can be used to index another
variable (e.g. a coolingFanTable variable) to retrieve more information on
the failed fan.
The trapTemperature variable reports the number of a temperature sensor
that detected a "normal" to "warning" transition (i.e., the temperature
rose above the threshold "warning" level which, in one embodiment, is
defined by the variable coolingAlertTemperature in the Cooling Group).
The trapPowerSupply variable reports the number of a power supply that has
been extracted/inserted or that has detected an AC/DC failure. In one
embodiment, this number can be used to index another variable (e.g., a
powerSupplyTable variable) to retrieve more information on the power supply
that caused this trap.
The trapCanister variable reports the name of a canister that has been
either extracted or inserted. The trapAdapter variable reports the number
of an adapter that is malfunctioning. In one embodiment, this number can
be used to index another variable (e.g., an adapterTable variable) to
retrieve more information on the related adapter.
The trapSlotFan variable reports the number of an I/O slot fan that failed
(i.e., dropped below a threshold minimum speed). The fan number
represented by this variable can be used to index another variable (e.g.,
a slotFanTable variable) to retrieve more information on the failed fan.
The trapCanisterFan variable reports the name of a canister whose cooling
system failed (i.e., the speed of at least one of the canister's fans
dropped below the threshold minimum speed).
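The way a trap value serves as an index into another table can be sketched as follows. The mapping of trap variables to tables follows the descriptions above; the table contents and the function name lookup_trap are invented for illustration.

```python
# Which table each numeric trap value indexes, per the descriptions above.
TRAP_TO_TABLE = {
    "trapCpu": "cpuTable",
    "trapSystemBoardFan": "coolingFanTable",
    "trapAdapter": "adapterTable",
    "trapSlotFan": "slotFanTable",
}

# Hypothetical table data held by the management software.
tables = {"cpuTable": {2: {"description": "hypothetical CPU entry"}}}


def lookup_trap(trap_name, number):
    """Return the table name and the row (or None) the trap value indexes."""
    table_name = TRAP_TO_TABLE[trap_name]
    row = tables.get(table_name, {}).get(number)
    return table_name, row
```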
Adapter Information
In one embodiment, a hot plug MIB includes an adapterGroup node 314 which
defines and groups variables used for performing hot plug operations
involving adapters of the server computer 204. Certain information may be
very useful in determining whether and how to perform hot plug operations
on adapters.
For example, not all adapters support hot plug procedures. Thus, before
performing a hot plug procedure on a particular adapter, it may be useful
to determine whether the adapter supports a hot plug procedure. It may
also be useful to determine the state of an adapter (e.g., failed,
malfunctioning, working normally, or suspended) and also to identify the
commands an adapter responds to (e.g., suspend, resume, reset, etc.).
Also, identifying the physical bus connected to an adapter and which
device driver is servicing an adapter may expedite hot plug operations on
an adapter. Hot plug MIB variables of the adapter group conveniently
represent that and other information and make such information easily
accessible to software modules such as Maestro 208 and the SNMP extension
agent 210 which control hot plug operations on adapters of a server
computer 204.
The following diagram represents the hot plug MIB variables defined in the
adapterGroup 314.
______________________________________
adapterGroup
______________________________________
1 adapterTable
  1 adapterTableEntry
    1 adapterNumber
    2 adapterName
    3 adapterSupportsHotSwapHotAdd
    4 adapterState
    5 adapterCommand
    6 adapterDriverNumber
    7 adapterBusNumber
    8 adapterDeviceNumber
    9 adapterFunctionNumber
    10 adapterVendorId
    11 adapterDeviceId
    12 adapterRevisionId
    13 adapterBaseClass
    14 adapterSubClass
    15 adapterProgrammingInterface
______________________________________
The adapterTable variable describes all adapters in the server computer
204, while the adapterTableEntry variable describes each adapter. The
adapterNumber variable provides a unique logical number of a particular
adapter, and in each case may be an integer greater than one which serves
as an index into the adapterTable.
The adapterName variable specifies a unique name of an adapter in an ASCII
character string format (the string would be empty when a name is not
available). The adapterSupportsHotSwapHotAdd is an integer variable
indicating whether a particular adapter supports (if value=1) or does not
support (if value=2) hot swap or hot add operations.
The adapterState variable, another integer variable, indicates the current
state of a particular adapter (1=unclaimed (no driver loaded for the
adapter), 2=unknown (adapter not responsive or does not support hot plug
operations, in either case no commands should be sent to the adapter),
3=failed (driver or adapter is malfunctioning), 4=active (driver and
adapter are working normally), 5=suspended (the adapter's operation has
been suspended)).
The adapterCommand variable, also an integer, indicates the commands that
may be sent to an adapter (1=reset (causes adapter to re-initialize and
resume operations), 2=suspend (suspends operations of the adapter in
preparation for hot add or hot swap), 3=forcesuspend (forcefully suspends
operations of the adapter in preparation for hot add or hot swap),
4=resume (causes adapter to resume operations)).
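The integer codes for adapterState and adapterCommand map naturally onto enumerations. The Python sketch below encodes the values listed above and the stated rule that no commands should be sent to an adapter in the unknown state; the class names and the helper may_send_command are assumptions introduced for illustration.

```python
from enum import IntEnum


class AdapterState(IntEnum):
    UNCLAIMED = 1  # no driver loaded for the adapter
    UNKNOWN = 2    # not responsive or no hot plug support
    FAILED = 3     # driver or adapter is malfunctioning
    ACTIVE = 4     # driver and adapter are working normally
    SUSPENDED = 5  # adapter's operation has been suspended


class AdapterCommand(IntEnum):
    RESET = 1
    SUSPEND = 2
    FORCE_SUSPEND = 3
    RESUME = 4


def may_send_command(state):
    """Commands should not be sent to an adapter in the unknown state."""
    return AdapterState(state) is not AdapterState.UNKNOWN
```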
The adapterDriverNumber variable indicates the logical number of the driver
handling a particular adapter, and may be an integer useful for indexing
into the driverTable to retrieve attributes of a driver. The
adapterBusNumber indicates a bus number for a particular adapter and, in
one embodiment, represents a physical PCI bus number.
The adapterDeviceNumber variable indicates a device number of a particular
adapter and may, in one embodiment, represent a physical PCI device
number. The adapterFunctionNumber variable provides a function number of a
particular adapter and may represent a PCI function number in one
embodiment. The adapterVendorId variable indicates a vendor identification
code for a particular adapter which may comply with a PCI vendor
identification format. The adapterDeviceId variable indicates a device
identification code for a particular adapter which, in one embodiment, may
represent a PCI device identification code.
The adapterRevisionId variable indicates a revision identification code for
a particular adapter and, in one embodiment, may represent a PCI revision
identification code. The adapterBaseClass variable identifies a Base Class
code for a particular adapter and may, in one embodiment, be a PCI Base
Class code. The adapterSubClass variable identifies a Sub Class code for a
particular adapter and may represent a PCI Sub Class code in one
embodiment. The adapterProgrammingInterface variable represents a
Programming Interface code for a particular adapter and, in one
embodiment, represents a PCI Programming Interface code.
It will be readily appreciated by those of ordinary skill in the art that
differently named MIB variables having different value ranges may be used
to represent information useful for performing hot plug operations on
adapters in a server computer.
Driver Information
Device drivers govern the exchange of information (provide an interface)
between some hardware components of a server computer and software modules
running on the server computer. Device drivers may also poll (test or
examine values or characteristics of) hardware, send commands to hardware,
or set parameters affecting operation of hardware. In one embodiment,
device drivers may suspend or resume their interaction with a hardware
component and may be requested to do so during hot plug operations.
Suspend and resume states of device drivers are described in greater
detail in the U.S. patent application entitled "HOT ADD OF DEVICES
SOFTWARE ARCHITECTURE" invented by Mehrdad Khalili, Mallik Arjunan
Mahalingam, John Reed, and Walter Wallach, filed along with the present
application on Oct. 1, 1997. Hot plug MIB variables in the driverGroup 316
include information useful for identifying a driver servicing a particular
hardware component (e.g., an adapter in one embodiment).
The following diagram represents the hot plug MIB variables defined in the
driverGroup 316 representing information about drivers useful in
performing hot plug operations.
______________________________________
driverGroup
______________________________________
1 driverTable
  1 driverTableEntry
    1 driverNumber
    2 driverName
    3 driverVersion
______________________________________
The driverTable variable describes attributes of device drivers servicing
components which may be involved in hot plug operations performed on a
server computer. The driverTable variable may be populated differently
depending on an operating system. For example, under the Windows NT™
operating system, the table may include information for drivers servicing
hardware components (e.g., adapters) which support hot plug operations.
The driverTableEntry variable contains attributes of particular device
drivers, including driverNumber, driverName, and driverVersion. The
driverNumber variable represents the unique logical number of a driver
and, in one embodiment, represents an integer value which can be used as an
index into the driverTable variable. The driverName variable is an ASCII
character string representing a unique name of a driver. The driverVersion
variable identifies the version of a driver in the form of an ASCII
string. Because different versions of device drivers may behave
differently, it may be useful to identify the version number of a device
driver servicing a hardware component (e.g., an adapter) to adjust
interaction with the driver. It will be understood by those of ordinary
skill that differently named MIB variables with different ranges of values
may represent driver information useful to perform hot plug operations.
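The link from the adapter group to the driver group (adapterDriverNumber indexing into the driverTable) can be sketched as a pair of keyed tables. The driver and adapter names and the version string below are hypothetical sample data, and driver_for_adapter is a helper introduced for illustration.

```python
# Hypothetical sample rows, keyed by driverNumber and adapterNumber.
driver_table = {
    1: {"driverName": "example_scsi", "driverVersion": "1.0"},
}
adapter_table = {
    1: {"adapterName": "example adapter", "adapterDriverNumber": 1},
}


def driver_for_adapter(adapter_number):
    """Follow adapterDriverNumber from an adapter row to its driver row."""
    adapter = adapter_table[adapter_number]
    return driver_table[adapter["adapterDriverNumber"]]


driver = driver_for_adapter(1)
```

Management software could use the driverVersion obtained this way to adjust how it interacts with the driver.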
Slot Information
To perform hot plug operations on components (such as adapters) interacting
with I/O expansion slots, various information about the expansion slots
may be useful. For example, software modules such as Maestro 208 or the
SNMP extension agent 210 which control hot plug operations related to
expansion slots may favorably use information indicating whether an
expansion slot is occupied (e.g., an adapter is inserted into the slot),
whether power is being supplied to a slot, what value (numeric or
otherwise) should be used to identify a slot, and which bus operations may
be suspended during a hot plug operation involving a particular slot. Hot
plug MIB variables in the slotGroup 318 represent that and other
information about I/O expansion slots useful in performing hot plug
operations on devices interacting with the expansion slots.
The following diagram represents the hot plug MIB variables defined in the
slotGroup 318 representing information about I/O expansion slots useful in
performing hot plug operations.
______________________________________
slotGroup
______________________________________
1 slotTable
  1 slotTableEntry
    1 slotGroupNumber
    2 slotNumber
    3 slotBusNumber
    4 slotDeviceNumber
    5 slotAdapterPresence
    6 slotPowerState
    7 slotLocation
______________________________________
The slotTable variable describes all I/O expansion slots in a server
computer 204. The slotTableEntry variable describes particular I/O
expansion slots. In one embodiment, the slotTable defines information
about PCI slots. A PCI slot may be identified by paired values
corresponding to bus number and device number.
The slotGroupNumber variable (the first entry of the slotTableEntry
variable) represents a group of slots to which a single slot may belong.
In one embodiment, a server computer 204 has only one group of I/O
expansion slots, and thus the slotGroupNumber may have a value of one. In
another embodiment, a server computer may have groups of slots where each
group corresponds to a canister and each canister has multiple (e.g., 4)
I/O expansion slots. In such an embodiment, the slotGroupNumber identifies
the canister (or group of I/O expansion slots) to which a slot belongs.
The slotGroupNumber variable, in one embodiment, can also be used as an
index into the canisterTable (described below).
The slotNumber variable indicates a logical slot number identifying a slot
within a group of slots (e.g. a value of 3 may correspond to the third
slot in a four-slot canister). The slotBusNumber variable represents a
value identifying the bus on which a particular slot resides and, in one
embodiment, may represent a physical PCI bus number. The slotDeviceNumber
variable represents the device number of a particular slot and, in one
embodiment, may represent the physical PCI device number.
The slotAdapterPresence variable indicates whether an adapter is present in
a particular slot and, in one embodiment, the slotAdapterPresence variable
is an integer (e.g., 1=card present in slot, 2=no card present in slot).
The slotPowerState variable indicates whether a particular slot has power.
In one embodiment the slotPowerState variable is an integer (e.g., 1=power
to slot is on, 2=power to slot is off, 3=unavailable (used for example
when slots are grouped in canisters and the canisters, rather than the
individual slots, are optionally powered)).
The slotLocation variable indicates the physical location of a particular
I/O expansion slot, in one embodiment according to paired integer values
corresponding respectively to slot group number and logical slot number
(e.g., (2,4) identifies the fourth slot in the second group of slots). One
of ordinary skill in the art will recognize that differently named MIB
variables having ranges of values differing from those described above may
represent information about I/O expansion slots useful in performing hot
plug operations.
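A slot table keyed by the paired (slot group number, slot number) location can be sketched in Python as shown below. The integer codes follow the slotAdapterPresence and slotPowerState descriptions above; the two sample rows and the function name describe_slot are invented for illustration.

```python
# Integer codes as described for the slot group variables.
ADAPTER_PRESENCE = {1: "card present", 2: "no card present"}
POWER_STATE = {1: "on", 2: "off", 3: "unavailable"}

# Hypothetical rows keyed by (slotGroupNumber, slotNumber).
slot_table = {
    (2, 1): {"slotAdapterPresence": 2, "slotPowerState": 2},
    (2, 4): {"slotAdapterPresence": 1, "slotPowerState": 1},
}


def describe_slot(location):
    """Return readable presence and power labels for a slot location."""
    entry = slot_table[location]
    return (ADAPTER_PRESENCE[entry["slotAdapterPresence"]],
            POWER_STATE[entry["slotPowerState"]])
```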
Canister Information
The canisterGroup node 320 defines hot plug MIB variables representing
information useful in performing hot plug operations on server computers
which organize I/O expansion slots into canisters such that each canister
contains multiple I/O expansion slots. The canisterGroup 320 hot plug MIB
variables include information describing, for example, the maximum number
of canisters in a server computer, the name of a canister (corresponding
to a name which may visibly and readably appear on the canister), the
power state of the canister (which generally supplies power to all
hardware components, such as adapters, occupying slots of the canister),
and the physical location of the canister.
The following diagram represents the hot plug MIB variables defined in the
canisterGroup 320 representing information about canisters holding
multiple I/O expansion slots, such information being useful in performing
hot plug operations involving hardware components connected to the
expansion slots.
______________________________________
canisterGroup
______________________________________
1 canisterMaximumNumberOfCanisters
2 canisterTable
  1 canisterTableEntry
    1 canisterNumber
    2 canisterName
    3 canisterSerialNumber
    4 canisterRevisionInfo
    5 canisterDescription
    6 canisterPowerState
    7 canisterLocation
    8 canisterFanMinSpeed
    9 canisterFanSpeedSetting
    10 canisterFan1Speed
    11 canisterFan1Fault
    12 canisterFan2Speed
    13 canisterFan2Fault
______________________________________
The canisterMaximumNumberOfCanisters variable indicates the maximum number
of canisters supported by a system. In one embodiment, a server computer
can hold at most four canisters while, in another embodiment, a server
computer has no (0) canisters.
The canisterTable variable describes the attributes of all the canisters in
the system, while the canisterTableEntry variable describes a particular
canister. The canisterNumber variable represents a unique number (e.g., an
integer) identifying a canister. The canisterName variable comprises an
ASCII character string identifying a canister (e.g., "A" or "B"). The
canisterSerialNumber variable represents the serial number of a canister.
The canisterRevisionInfo variable indicates the revision number and
date of the canister's controller. The canisterDescription variable
provides an ASCII character string containing a short description of a
canister. The canisterPowerState variable indicates whether the canister
has power (e.g., an integer value where 1=canister has power, 2=canister
has no power).
The canisterLocation variable describes the physical location of a
canister. For example, in one embodiment wherein the canisters are located
in quadrants, "A" denotes the top left canister location, "B" denotes the
top right canister location, "C" denotes the bottom left canister
location, and "D" denotes the bottom right canister location.
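As a hedged sketch of how management software might hold one row of the canister table in memory, the following example mirrors a few of the variables described above. The class name CanisterEntry and its field names are hypothetical; only the power state values and the quadrant letters come from the description.

```python
from dataclasses import dataclass

# Quadrant meanings for canisterLocation, as given for one embodiment.
CANISTER_QUADRANTS = {
    "A": "top left",
    "B": "top right",
    "C": "bottom left",
    "D": "bottom right",
}

POWER_ON = 1   # canisterPowerState: canister has power
POWER_OFF = 2  # canisterPowerState: canister has no power


@dataclass
class CanisterEntry:
    """Hypothetical in-memory mirror of one canisterTableEntry row."""
    number: int
    name: str
    power_state: int
    location: str

    def has_power(self) -> bool:
        return self.power_state == POWER_ON

    def quadrant(self) -> str:
        return CANISTER_QUADRANTS[self.location]


entry = CanisterEntry(number=2, name="B", power_state=POWER_ON, location="B")
```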
The canisterFanMinSpeed variable represents a minimum fan speed in
revolutions per second below which a fault condition occurs. The
canisterFanSpeedSetting variable indicates the current fan speed in
revolutions per second. In one embodiment, the canisterFanSpeedSetting is
an integer where 1=low speed and 2=high speed.
In an embodiment where each canister of a server computer contains two
cooling fans, the canisterFan1Speed variable represents the current
speed of fan number one in revolutions per second. The canisterFan1Fault
variable indicates the current state of fan number one (e.g., 1=fan
faulted, 2=fan OK). Like the canisterFan1Speed variable, the
canisterFan2Speed variable represents the current speed of fan number
two in revolutions per second. The canisterFan2Fault variable indicates
whether fan number two is operating or has faulted. One of ordinary skill
will understand that hot plug MIB variables which represent information
about canisters useful in hot plug operations may have names or value
ranges different from the canister hot plug MIB variables defined above.
Moreover, one of ordinary skill will recognize that many techniques exist
for defining and grouping variables and that the present invention is not
limited by a MIB or the MIB format.
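The relationship between the minimum fan speed and the fan fault variables can be sketched as below. This is an assumption-laden illustration: the description does not state which component derives the fault value, and the function name fan_fault_value is hypothetical.

```python
FAN_FAULTED = 1  # fan fault variable: fan faulted
FAN_OK = 2       # fan fault variable: fan OK


def fan_fault_value(speed_rps: int, min_speed_rps: int) -> int:
    """Return the fault code for a fan, given its speed and canisterFanMinSpeed.

    Both speeds are in revolutions per second, as in the text. A speed
    below the minimum is treated as a fault condition.
    """
    return FAN_FAULTED if speed_rps < min_speed_rps else FAN_OK
```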
FIG. 5 illustrates steps to create hot plug MIB variables 214 maintained by
the SNMP extension agent 210. In a first step 502, the server computer 204
starts. The server computer 204 is typically started (powered on) by a
network administrator.
In another step 504, the operating system software running on the server
computer 204 loads device drivers, including configuration manager 218 and
Intrapulse manager 220. In a further step 506, SNMP service starts. In a
still further step 508, SNMP agent 209 and SNMP extension agent 210 start.
In a step 510, the SNMP extension agent 210 confirms that the SNMP service
is running on the server computer 204. In the step 510, the SNMP extension
agent also confirms that the configuration manager 218 and Intrapulse
manager 220 are also running on the server computer 204. In one
embodiment, if any of the SNMP service, the configuration manager 218, or
the Intrapulse manager 220 is not running, the SNMP extension agent 210
causes a message to be displayed warning a network administrator that the
SNMP extension agent 210 has detected that certain other software is not
running. The message may also indicate the name of the software module
and may suggest that the network administrator load or start that
software module.
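The confirmation performed in the step 510 might be sketched as follows. The function name missing_module_warnings and the wording of the warning are assumptions for illustration; how the agent actually detects a running module is not described here and is replaced by a simple set of names.

```python
from typing import List, Set

REQUIRED_MODULES = ("SNMP service", "configuration manager", "Intrapulse manager")


def missing_module_warnings(running: Set[str]) -> List[str]:
    """Build one warning per required module that is not in `running`."""
    return [
        f"SNMP extension agent: '{name}' is not running; load or start it."
        for name in REQUIRED_MODULES
        if name not in running
    ]


# Example: the Intrapulse manager has not been loaded.
warnings = missing_module_warnings({"SNMP service", "configuration manager"})
```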
In a further step 512, the SNMP extension agent 210 creates data structures
corresponding to hot plug MIB variables 214. Initially, the data
structures are not populated with values corresponding to hot plug MIB
variables. The SNMP extension agent 210 calls the configuration manager
218 requesting adapter and driver data. The configuration manager which,
in one embodiment, communicates with the BIOS (basic input/output system)
of the server computer 204, polls the adapters in the server computer 204
and determines associated drivers and provides to the SNMP extension agent
210 data values relating to adapters and drivers in the server computer
204. In the step 512, the SNMP extension agent 210 loads the adapter and
driver values received from the configuration manager 218 into the data
structures created earlier.
The SNMP extension agent 210, in the step 512, calls the Intrapulse manager
220 requesting information about I/O expansion slots, canisters, power
supplies, and system boards. The Intrapulse manager 220 which, in one
embodiment, communicates with the BIOS of the server computer 204, polls
the slots, canisters, power supplies and system boards of the server
computer 204 and provides to the SNMP extension agent 210 data values
corresponding to each of those components. The SNMP extension agent 210
then loads the I/O expansion slot data, the canister data, the power
supply data, and the system board data into the data structures. Thus the
SNMP extension agent 210 acquires hot plug MIB variables 214 shortly after
the server computer 204 starts. One of ordinary skill will recognize that
the SNMP extension agent 210 may call device drivers differing from those
described above to request information related to hot plug MIB variables
and will also recognize that such device drivers may poll hardware
components differing from those described above. The present invention is
thus not limited by device drivers or polled hardware components.
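The two-phase population described in the step 512 (create empty data structures, then fill them from the configuration manager and the Intrapulse manager) can be sketched as below. The callback signatures and dictionary layout are hypothetical stand-ins; real managers would poll hardware rather than return canned lists.

```python
from typing import Callable, Dict, List

# Component kinds each manager is asked about, per the text.
CONFIG_MANAGER_KINDS = ["adapters", "drivers"]
INTRAPULSE_KINDS = ["slots", "canisters", "power_supplies", "system_boards"]


def build_mib_store(
    query_config_manager: Callable[[str], List[dict]],
    query_intrapulse_manager: Callable[[str], List[dict]],
) -> Dict[str, List[dict]]:
    """Create empty structures, then fill them from the two managers."""
    store: Dict[str, List[dict]] = {
        kind: [] for kind in CONFIG_MANAGER_KINDS + INTRAPULSE_KINDS
    }
    for kind in CONFIG_MANAGER_KINDS:
        store[kind] = query_config_manager(kind)
    for kind in INTRAPULSE_KINDS:
        store[kind] = query_intrapulse_manager(kind)
    return store


# Stand-in callbacks used only for this example.
store = build_mib_store(
    lambda kind: [{"kind": kind, "number": 1}],
    lambda kind: [],
)
```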
Maestro 208 maintains hot plug MIB variables 212 by periodically executing
a retrieve_data routine. In one embodiment, a software timer
provided by an operating system produces a timer event at a predetermined
time interval. Such software timers are known in the art and the present
invention is not limited by a software timer. The retrieve_data
routine responds to the timer event by executing. Thus, the
retrieve_data routine may execute at predetermined time intervals.
The retrieve_data routine generates and sends SNMP requests (e.g.,
requests of type GET_REQUEST or GET_NEXT_REQUEST
specifying particular hot plug MIB variables) over the network 200 to the
server computer 204. In one embodiment, each request generated and sent by
the retrieve_data routine seeks one MIB variable value. The SNMP
extension agent 210 receives the requests, polls or queries the components
identified by MIB variables in the requests, and sends response messages
to the client computer 202. In one embodiment, each response message
includes the value of one MIB variable. Maestro 208 receives the response
messages and updates the hot plug MIB variables 212.
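A minimal sketch of the one-variable-per-request behavior of the retrieve_data routine follows. The network exchange is replaced by an injected callable, and the parameter names are assumptions; the two object identifiers are the examples used elsewhere in this description.

```python
from typing import Callable, Dict, Iterable


def retrieve_data(
    oids: Iterable[str],
    send_get_request: Callable[[str], int],
    local_mib: Dict[str, int],
) -> int:
    """Request each OID individually and store each returned value locally.

    Returns the number of requests sent.
    """
    sent = 0
    for oid in oids:
        local_mib[oid] = send_get_request(oid)  # one variable per request
        sent += 1
    return sent


# Stand-in for the server-side agent: a dictionary lookup.
server_values = {
    "1.3.6.1.4.1.837.2.4.1.1.6.1.3": 1,  # slotPowerState, group 1, slot 3
    "1.3.6.1.4.1.837.2.5.2.1.6.2": 1,    # canisterPowerState, canister 2
}
client_mib: Dict[str, int] = {}
requests_sent = retrieve_data(server_values, server_values.__getitem__, client_mib)
```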
By executing the retrieve_data routine at predetermined time
intervals, Maestro 208 ensures that the hot plug MIB variables 212 are
current. One of ordinary skill in the art will appreciate, however, that
the degree to which the hot plug MIB variables 212 are current relates to
the time interval between executing the retrieve_data routine. In
one embodiment, the time interval may be changed by a user.
In another embodiment, the retrieve_data routine runs on the server
computer 204 and executes periodically to refresh the hot plug MIB
variables 214 stored in computer readable media on the server computer
204. Moreover, a broadcast_hot_plug_MIB_data routine executes
periodically on the server computer 204. The
broadcast_hot_plug_MIB_data routine causes hot plug MIB variable data
to be refreshed on client computers. In one embodiment, SNMP agent
software running on a server computer 204 periodically generates a
TRAP_Hot_Plug_Variables message. In another embodiment, SNMP agent
software generates a TRAP_Hot_Plug_Variable message in response to
configuration changes affecting the components of a server computer 204.
While TRAP requests typically correspond to alert conditions, the SNMP
agent software generates a TRAP_Hot_Plug_Variable message to signal a
refresh event for hot plug MIB variables. In this embodiment, the SNMP
agent software sends the TRAP_Hot_Plug_Variable message to one or more
client computers. A client computer receives the
TRAP_Hot_Plug_Variable message and, in response, executes the
retrieve_data routine to refresh the hot plug MIB variables 212.
In an alternative embodiment, a management software module runs on the
server computer 204 and issues an execute retrieve request to one or more
client computers, requesting that the client computer 202 execute a
retrieve_data routine. An SNMP agent module on the client computer
receives the execute retrieve request and executes the retrieve_data
routine on the client computer. The retrieve_data routine
retrieves and stores hot plug MIB variables 212 as described above. Those
of ordinary skill in the art will appreciate that other methods exist for
periodically broadcasting hot plug MIB variable data from a server
computer 204 to a client computer 202.
In still another embodiment, the network administrator directs Maestro 208
to execute the retrieve_data routine. It will be appreciated by
those of ordinary skill that manual executions of the retrieve_data
routine may be combined with timed execution of the retrieve_data
routine.
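One way to combine timed and manual execution of the retrieve_data routine is sketched below. The class RefreshScheduler is hypothetical, takes the current time as an explicit argument rather than reading a system timer, and assumes (as a design choice not stated in the description) that a manual run restarts the interval.

```python
from typing import Optional


class RefreshScheduler:
    """Decides when retrieve_data should run, given an explicit clock value."""

    def __init__(self, interval_s: float) -> None:
        self.interval_s = interval_s  # user-adjustable time interval
        self.last_run: Optional[float] = None

    def should_run(self, now: float, manual: bool = False) -> bool:
        """Return True when a refresh is due or was manually requested."""
        due = self.last_run is None or now - self.last_run >= self.interval_s
        if manual or due:
            self.last_run = now  # assumption: a manual run restarts the interval
            return True
        return False
```

With a 60 second interval, a refresh requested manually at 30 seconds runs at once, and the next timed refresh becomes due at 90 seconds.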
FIG. 6 illustrates steps performed in hot adding an adapter card to an I/O
expansion slot of a server computer 204 which does not use canisters. In a
first step 602, Maestro 208 accesses I/O expansion slot information from
the hot plug MIB variables 212 and displays information identifying empty
expansion slots (i.e., expansion slots currently not holding an adapter
card or other device) to a network administrator on a computer monitor of
the client computer 202. The displayed expansion slot information
corresponds to the current configuration of the server computer 204.
FIG. 7 illustrates an I/O expansion slot arrangement and bus configuration
for the server computer 204. The server computer 204 has eight I/O
expansion slots 702. Each I/O expansion slot 702 connects to a slot bus
704 which connects through a bridge 705 to a main bus 706. A power switch
700 regulates voltage provided to each I/O expansion slot 702.
As shown in FIG. 7, one embodiment assigns to each I/O expansion slot 702 a
unique number from 1-8. Thus, each I/O expansion slot 702 is uniquely
identified to permit software, firmware, or a network administrator to
unambiguously direct commands or queries to a desired I/O expansion slot
702. The SNMP extension agent 210 may direct commands or queries to an I/O
expansion slot 702.
A software module, including the SNMP extension agent 210, may also control
the power switch 700 by issuing power down or power up commands to the
switch, respectively causing the power switch 700 to stop or start
providing voltage to the I/O expansion slot 702. Alternatively, a manual
switch may control the power to each slot.
Each I/O expansion slot 702 may accept a card or adapter. The card or
adapter may comprise any type of computer adapter. In one embodiment, an
I/O expansion slot 702 accepts a device control card (e.g., a SCSI
adapter). In another embodiment, the I/O expansion slot 702 accepts a
network communication card (e.g., an ethernet adapter card).
The slot bus 704 channels the input and output of data to and from an
adapter card in the I/O expansion slot 702. In one embodiment, the slot
bus 704 is configured to operate under the PCI (peripheral component
interconnect) standard. However, the present invention is not limited by
any bus.
In a step 604 (See FIG. 6), the network administrator views the displayed
information regarding empty I/O expansion slots and selects an empty I/O
expansion slot 702 in which to hot add an adapter card. The network
administrator then directs Maestro 208 to initiate the hot add operation.
In a further step 606, Maestro 208 uses the slotPowerState MIB variable to
generate an SNMP request (or simply request) of the type SET_REQUEST
to shut off the power to the selected slot. In one embodiment, the
request includes the MIB variable of 1.3.6.1.4.1.837.2.4.1.1.6.1.3
(referencing the power state of the third I/O expansion slot of the first
(and only) group of eight slots) and the value of 2 (power state: OFF) to
request that power be shut off to the third I/O expansion slot of the
server computer 204. In the step 606, Maestro issues the request to the
server computer 204 over the network 200.
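The construction of the request in the step 606 can be illustrated as follows. The dictionary used to represent a request is a hypothetical simplification (an actual SNMP message is BER-encoded), and the function names are assumptions; the object identifier and values are those given in the description.

```python
# Column prefix taken from the example OID 1.3.6.1.4.1.837.2.4.1.1.6.1.3.
SLOT_POWER_STATE_COLUMN = "1.3.6.1.4.1.837.2.4.1.1.6"
POWER_ON, POWER_OFF = 1, 2


def slot_power_state_oid(group: int, slot: int) -> str:
    """Append the slot group number and slot number to the column prefix."""
    return f"{SLOT_POWER_STATE_COLUMN}.{group}.{slot}"


def make_set_request(oid: str, value: int) -> dict:
    """Hypothetical request record standing in for an encoded SNMP message."""
    return {"type": "SET_REQUEST", "oid": oid, "value": value}


# Shut off power to the third slot of the first (and only) group.
request = make_set_request(slot_power_state_oid(group=1, slot=3), POWER_OFF)
```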
In a further step 608, the SNMP extension agent 210 receives the request to
shut off power to the selected I/O expansion slot. The SNMP extension
agent calls the Intrapulse manager 220 and passes to it a parameter
identifying the selected slot and a value corresponding to the desired
power state (i.e., shut off power). The Intrapulse manager 220 accesses
memory reserved for the selected slot (in one embodiment, reserved memory
is accessed using a fixed base address plus an offset indicated by the
passed identification parameter), and the Intrapulse manager 220 alters
the value of a register in the accessed memory area, the alteration
causing a power switch 700 connected to the selected I/O expansion slot to
shut off power to the slot. The Intrapulse manager 220 returns a
completion code to the SNMP extension agent 210 which, in turn, transmits
a completion response message to the client computer 202.
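The base-address-plus-offset register access of the step 608 is sketched below using a byte array in place of reserved memory. The base address, the one-byte register layout, and the completion code of 0 are all assumptions chosen for illustration and are not taken from the description.

```python
SLOT_REGISTER_BASE = 0x10  # hypothetical base address of the slot registers
POWER_ON, POWER_OFF = 1, 2


def set_slot_power(memory: bytearray, slot: int, power_state: int) -> int:
    """Write the power register for `slot`; return 0 as a completion code."""
    address = SLOT_REGISTER_BASE + slot  # fixed base plus slot offset
    memory[address] = 1 if power_state == POWER_ON else 0
    return 0


memory = bytearray(64)              # stand-in for the reserved memory area
memory[SLOT_REGISTER_BASE + 3] = 1  # slot 3 starts powered
code = set_slot_power(memory, slot=3, power_state=POWER_OFF)
```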
In still a further step 610, Maestro receives the completion response
message and prompts the network administrator (e.g., by displaying a
message on the monitor of the client computer) to physically install the
adapter in the selected I/O expansion slot of the server computer 204. In
a next step 612, the network administrator installs the adapter card in
the selected I/O expansion slot. The network administrator loads and
configures a device driver to service the added adapter if such a driver
is not already loaded and configured on the server computer 204. In
another embodiment, the MIB variables include information matching a
device driver to the added adapter as well as information to load and to
configure the device driver, thus permitting Maestro 208, rather than the
network administrator, to load and configure a device driver to service
the added adapter. The network administrator then directs Maestro 208 to
complete the hot add operation (e.g., by using a mouse to click a command
button displayed by Maestro 208 and labelled "OK" or "Next").
In a step 614, Maestro generates a request to restore power to the selected
slot, the request including, in one embodiment, the MIB variable
1.3.6.1.4.1.837.2.4.1.1.6.1.3 (again referencing the power state of the
third I/O expansion slot of the first group of eight slots) and the value
of 1 (power state: ON) to request that power be resumed to the third I/O
expansion slot of the server computer 204. Maestro 208 sends the request
over the network 200 to the server computer 204.
In another step 616, the SNMP extension agent 210 receives the request to
restore power to the I/O expansion slot and calls the Intrapulse manager
220 passing it parameters identifying the slot and the desired power
state. Intrapulse manager 220 alters the appropriate register, restoring
power to the slot. The SNMP extension agent 210 generates a completion
response message and sends it to the client computer 202.
In a step 618, Maestro 208 executes the retrieve_data routine to
update the hot plug MIB variables. After the retrieve_data routine
finishes, Maestro 208 examines the hot plug MIB variable adapterState (now
updated and stored locally in computer readable media of the client
computer 202) to verify that the added adapter and its driver are
operating normally. In one embodiment, the adapterState variable is
1.3.6.1.4.1.837.2.2.1.1.4.3 (referencing the state of the adapter and
driver for adapter number 3). The value corresponding to this variable may
indicate that the adapter and its device driver are operating normally
(e.g., value=4) whereupon Maestro 208 notifies the network administrator
that the hot add operation completed successfully. If the value
corresponding to the adapterState variable indicates other than successful
completion, Maestro 208 notifies the network administrator that the hot
add operation was not successful. The network administrator can take
remedial action, including performing a hot swap operation as described
below.
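The verification of the step 618 might be sketched as follows, assuming the locally stored hot plug MIB variables are held in a dictionary keyed by object identifier. The function name and the notification strings are hypothetical; the value 4 is the example value given above for normal operation.

```python
# Column prefix taken from the example OID 1.3.6.1.4.1.837.2.2.1.1.4.3.
ADAPTER_STATE_COLUMN = "1.3.6.1.4.1.837.2.2.1.1.4"
ADAPTER_OPERATING_NORMALLY = 4  # example value from the text


def hot_add_result(local_mib: dict, adapter_number: int) -> str:
    """Return the notification to show after checking adapterState."""
    oid = f"{ADAPTER_STATE_COLUMN}.{adapter_number}"
    if local_mib.get(oid) == ADAPTER_OPERATING_NORMALLY:
        return "Hot add completed successfully."
    return "Hot add was not successful."
```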
FIG. 8 illustrates steps performed in hot adding an adapter card to a
server computer 204 having canisters (i.e., in one embodiment, a canister
has multiple I/O expansion slots that cannot be powered down
individually, the canister is removable, and power to the canister may
be set on or off by software or by physical manipulation). In a first step
802, Maestro 208 displays on a computer monitor of the client computer
202, viewed by a network administrator, I/O expansion slot and canister
information extracted from hot plug MIB variables 212.
FIG. 9 illustrates an I/O expansion slot arrangement and bus configuration
for a server computer having four canisters 904, each of which supports up
to 4 I/O expansion slots 902. Each slot 902 may accept a card or adapter.
A slot bus (not shown) connects each slot to a canister bus 906. The
canister bus 906 provides input and output for adapters in any of the four
I/O expansion slots 902 in the canister 904. The canister bus 906 and a
main bus 900 connect at a bridge 905. The bridge 905 distributes data from
the main bus 900 to each of the canister buses 906.
In one embodiment, the four canisters are named "A", "B", "C", and "D"
respectively, starting with "A" at the upper-left corner and proceeding in
a clockwise rotation. Each slot is numbered 1-4 respectively, moving from
top-to-bottom in a canister. The named canisters and numbered slots
provide unique identification for software modules to address the
canisters and slots. A power switch 908 regulates power to a canister 904,
each canister 904 having a power switch 908. The power switch 908 can be
controlled by software.
In a next step 804 (see FIG. 8), the network administrator selects a
canister and an empty I/O expansion slot within the canister in which to
hot add an adapter (e.g., the third slot in the second canister). In a
next step 806, Maestro 208 determines whether any adapters currently exist
in the selected canister. Maestro 208 makes this determination by
accessing the slotAdapterPresence variable (e.g.,
1.3.6.1.4.1.837.2.4.1.1.5.2.n, which references a variable representing
the status of occupancy of a slot in the second canister, and where n
represents the number of the slot 1-4 within the second canister). Maestro
208 examines the slotAdapterPresence for each slot in the canister. If, in
the step 806, Maestro 208 determines there are no adapters currently in
any slots of the selected canister, then, in a step 808, Maestro 208
generates a request to shut off the power to the selected canister.
Maestro 208 uses the canisterPowerState MIB variable (e.g.,
1.3.6.1.4.1.837.2.5.2.1.6.2, referencing the power state of the second
canister) and a value of 2 (i.e., power OFF) to generate the request.
Maestro 208 sends the power shutoff request over the network 200 to the
server computer 204.
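The occupancy check of the step 806 and the power shutoff request of the step 808 can be sketched together as below. The value used for "adapter present" is an assumption, since this passage does not state the encoding of slotAdapterPresence; treating a missing variable as an empty slot is likewise an assumption, and the request dictionary is a hypothetical simplification.

```python
from typing import Dict, Optional

SLOT_ADAPTER_PRESENCE_COLUMN = "1.3.6.1.4.1.837.2.4.1.1.5"
CANISTER_POWER_STATE_COLUMN = "1.3.6.1.4.1.837.2.5.2.1.6"
ADAPTER_PRESENT = 1  # ASSUMED encoding; not stated in this passage
POWER_OFF = 2        # canisterPowerState value for no power
SLOTS_PER_CANISTER = 4


def canister_is_empty(local_mib: Dict[str, int], canister: int) -> bool:
    """Examine slotAdapterPresence for each slot 1-4 of the canister."""
    return all(
        local_mib.get(f"{SLOT_ADAPTER_PRESENCE_COLUMN}.{canister}.{n}")
        != ADAPTER_PRESENT
        for n in range(1, SLOTS_PER_CANISTER + 1)
    )


def power_off_request_if_empty(
    local_mib: Dict[str, int], canister: int
) -> Optional[dict]:
    """Return a power shutoff request only when no adapter is present."""
    if not canister_is_empty(local_mib, canister):
        return None  # adapters exist; drivers must be suspended first
    return {
        "type": "SET_REQUEST",
        "oid": f"{CANISTER_POWER_STATE_COLUMN}.{canister}",
        "value": POWER_OFF,
    }
```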
The SNMP extension agent 210 receives the power shutoff request. The SNMP
extension agent 210 calls the Intrapulse manager 220 passing parameters
identifying the selected canister and indicating the desired state of
power to the canister (i.e., shut off). The Intrapulse manager 220
accesses reserved memory corresponding to registers related to the
canisters in the server computer 204 and sets the value of a register in
that reserved memory, the setting of the register causing power to be shut
off to the selected (e.g., second) canister. The Intrapulse manager 220
returns a successful completion code. The SNMP extension agent 210 sends a
successful completion response message to the client computer 202.
In a next step 810, Maestro 208 prompts the network administrator to add an
adapter to the selected empty I/O expansion slot of the selected canister.
The network administrator then physically adds an adapter to the selected
slot of the selected canister. The network administrator loads a device
driver for the adapter on the server computer 204 if the device driver is
not already on the server computer 204, and the network administrator
configures the device driver to service the adapter if the device driver
is not already configured to do so. One of ordinary skill in the art will
recognize that Maestro 208 (or other management software module) may, with
an appropriate set of hot plug MIB variables, determine a device driver
for the adapter, locate that device driver, load the device driver if it
is not already loaded, and configure the device driver to service the
adapter if it is not already so configured. The network administrator then
indicates to Maestro 208 that an adapter card has been added.
In a further step 812, Maestro 208 generates a request to restore power
similar to the power shutoff request generated in the step 808, different
only in that the power state is requested to be on. Thus, the value of 1
(i.e., power ON) is combined with the canisterPowerState MIB variable
(e.g., 1.3.6.1.4.1.837.2.5.2.1.6.2) to generate an SNMP request of the
type SET_REQUEST. Maestro 208 sends the request to the server
computer 204. The SNMP extension agent 210 receives the power on request
and calls the Intrapulse manager 220 to restore power to the canister.
When Maestro 208 receives the completion message indicating that power has
been resumed to the canister, the verification step 618, as described in
relation to FIG. 6, is performed.
If, in the step 806, Maestro 208 determines that at least one adapter
exists in the selected canister, then Maestro 208 accesses the hot plug
MIB variables 212 to determine whether all adapters existing in the
canister support hot swap operations. Maestro 208 accesses the MIB
variable adapterSupportHotSwapHotAdd (e.g., 1.3.6.1.4.1.837.2.2.1.1.3.n
where n=logical number of adapter, referencing the capability of an
adapter to support hot swap operations) to determine, in a further step
814, whether each adapter in the canister supports hot swap operations.
If at least one adapter does not support hot swap, then Maestro 208, in a
step 816, warns the network administrator that one of the adapters in the
canister does not support hot swap operations. Maestro 208 then, in the
step 816, queries the network administrator whether to proceed anyway. If
the network administrator decides not to proceed, then, in a step 818, the
hot add process aborts.
If, in the step 816, the network administrator decides to proceed, then, in
a step 820, Maestro 208 generates a request to force suspended operation
of device drivers servicing the adapters existing in the selected
canister. Maestro 208 uses the hot plug MIB variable adapterCommand (e.g.,
1.3.6.1.4.1.837.2.2.1.1.5.n where n is the logical number of the selected
adapter) and the value 3 (Force Suspend) to issue the force suspended
operations request. Note that under some operating systems, such as
Windows NT, an adapter may be hot swapped when a device driver servicing
the adapter can be suspended and resumed.
If, in the step 814, Maestro 208 determines that all adapters existing in
the selected canister support hot swap operations, then, in a step 822,
Maestro 208 uses the adapterCommand MIB variable to generate a request to
suspend operation of any driver servicing any existing adapter in the
selected canister. In a further step 824, the SNMP extension agent 210
receives the request to suspend operations and calls the configuration
manager 218 to suspend operation of any drivers servicing existing
adapters in the selected canister.
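The branching among the steps 814 through 822 can be summarized in the following sketch. Hot swap support is passed in as booleans because the encoding of adapterSupportHotSwapHotAdd is not given in this passage, and commands are returned by name rather than by value because only the Force Suspend value (3) is stated here. The function name plan_suspend is hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Column prefix taken from the example OID 1.3.6.1.4.1.837.2.2.1.1.5.n.
ADAPTER_COMMAND_COLUMN = "1.3.6.1.4.1.837.2.2.1.1.5"


def plan_suspend(
    supports_hot_swap: Dict[int, bool], proceed_anyway: bool
) -> Optional[List[Tuple[str, str]]]:
    """Return (OID, command name) pairs, or None if the hot add aborts.

    Keys of `supports_hot_swap` are logical adapter numbers in the canister.
    """
    all_support = all(supports_hot_swap.values())
    if not all_support and not proceed_anyway:
        return None  # step 818: the hot add process aborts
    command = "suspend" if all_support else "force_suspend"  # steps 822 / 820
    return [
        (f"{ADAPTER_COMMAND_COLUMN}.{n}", command)
        for n in sorted(supports_hot_swap)
    ]
```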
In a step 826, Maestro 208 generates one or more requests using the
adapterState MIB variable to determine whether all drivers were suspended.
Maestro 208 sends the requests to the server computer 204. The SNMP
extension agent 210 receives the requests, makes respective calls to the
configuration manager 218 to determine whether drivers servicing
adapters in the selected canister were suspended, generates response
messages indicating whether the drivers were suspended, and sends those
response messages to the client computer 202.
If, in the step 826, Maestro 208 determines from the response messages sent
by the SNMP extension agent 210 that not all drivers were suspended, then
in a next step 828, Maestro 208 warns the network administrator that not
all drivers were suspended and queries the network administrator whether
to proceed anyway. If, in the step 828, the network administrator
determines not to proceed, then in a further step 830, the hot add process
aborts.
If, in the step 828, the network administrator decides to proceed anyway,
then Maestro 208 again, in the step 822, issues requests to suspend device
drivers servicing adapters in the selected canister. The steps 822-826
loop up to five times as long as Maestro 208 determines that at
least one device driver has not been suspended. At the end of the fifth
iteration through the steps 822-826, Maestro 208, in the step 828, again
warns the network administrator that not all device drivers were
suspended. Programmatic branching from the step 828 repeats as described
above.
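The repeated suspend-and-check cycle of the steps 822-826 is sketched below. The sketch reads the five-iteration loop as an upper bound on attempts before the administrator is warned again, which is an interpretation of the description; the function and parameter names are hypothetical, and the network requests are replaced by injected callables.

```python
from typing import Callable, Tuple

MAX_ATTEMPTS = 5  # iterations of steps 822-826 before warning again


def suspend_with_retries(
    request_suspend: Callable[[], None],
    all_suspended: Callable[[], bool],
    max_attempts: int = MAX_ATTEMPTS,
) -> Tuple[bool, int]:
    """Repeat suspend (822) and check (826); return (success, attempts used)."""
    for attempt in range(1, max_attempts + 1):
        request_suspend()
        if all_suspended():
            return True, attempt
    return False, max_attempts  # step 828: warn the administrator again
```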
If, in the step 826, all drivers were suspended, then in a step 832 the
steps as described in 808-812 are performed. In a next step 834, Maestro
208 generates a request to resume operations on the suspended drivers. To
generate that request, Maestro 208 utilizes the hot plug MIB variable
adapterCommand. Maestro 208 sends the resume operations request to the
server computer 204.
The SNMP extension agent 210 receives the resume operation request and
calls the configuration manager 218 requesting that drivers associated
with the adapters in the canister be resumed. The configuration manager
218 returns a successful completion code upon resuming the operation of
the suspended drivers. The SNMP extension agent 210 sends a successful
completion response message to the client computer 202. In a step 836,
Maestro 208 verifies that the adapters in the selected canister are
working normally according to the procedures described in step 618 (see
FIG. 6) performed for each adapter in the selected canister.
Those of ordinary skill in the art will appreciate that the steps described
in relation to FIG. 8 also perform hot swap operations on a server
computer having canisters, the steps modified, however, as follows: steps
806-812 are not performed, and thus the step 814 of determining whether
the adapters in the selected canister support hot swap operations occurs
immediately after the step 804 wherein the network administrator selects a
canister and I/O expansion slot with respect to which the hot swap
operation will be performed.
FIG. 10 illustrates steps performed to hot swap an adapter in a server
computer that does not have canisters. In a first step 1002, Maestro 208
accesses adapter and I/O expansion slot information from the hot plug MIB
variables 212 and displays information identifying the adapter cards in
I/O expansion slots. The displayed information corresponds to the current
configuration of the server computer 204.
In a next step 1004, the network administrator selects an adapter to hot
swap. In a further step 1006, Maestro 208 accesses the hot plug MIB
variable adapterSupportHotSwapHotAdd (e.g., 1.3.6.1.4.1.837.2.2.1.1.3.n
where n=logical number of adapter, referencing the capability of an
adapter to support hot swap operations) to determine whether the selected
adapter supports hot swap operations. If the adapter does not support hot
swap operations, then in a further step 1008, Maestro 208 warns the
network administrator that the adapter does not support hot swap
operations and queries the network administrator whether to proceed
anyway. If, in the step 1008, the network administrator decides not to
proceed, then, in a step 1010, the hot swap procedure aborts.
If, in the step 1008, the network administrator decides to proceed, then,
in a further step 1012, Maestro 208 generates a request to force suspended
operation of the device driver servicing the selected adapter. Maestro 208
uses the hot plug MIB variable adapterCommand (e.g.,
1.3.6.1.4.1.837.2.2.1.1.5.n where n is the logical number of the selected
adapter) and the value 3 (Force Suspend) to issue the force suspended
operations request. The SNMP extension agent 210 receives the request and
performs the force suspend operation by calling the configuration manager
218. The configuration manager 218 returns a completion code, and the SNMP
extension agent 210 sends a successful completion response message to the
client computer 202.
If, in the step 1006, Maestro 208 determines that the adapter does support
hot swap operations, then, in a step 1014, Maestro 208 uses the
adapterCommand hot plug MIB variable to generate a request to suspend
operation of the device driver servicing the selected adapter. The SNMP
extension agent 210 receives the request and calls the configuration
manager 218 to perform the suspension. The SNMP extension agent 210 sends
a successful completion response message to the client computer 202.
In a further step 1016, Maestro 208 uses the hot plug MIB variable
adapterState to generate a request to determine whether the device driver
servicing the selected adapter was suspended. Maestro 208 sends the
request to the server computer 204, where the SNMP extension
agent 210 receives the request and calls the configuration manager 218.
The configuration manager 218 tests the state of the device driver and
returns a code to the SNMP extension agent 210 indicating the state of the
device driver (e.g., suspended or operating). The SNMP extension agent 210
generates a response message indicating whether the device driver is
suspended and sends the response message to the client computer 202.
In the step 1016, if Maestro determines that the device driver was not
suspended (by examining the response message), then, in a next step 1018,
Maestro 208 warns the network administrator that the device driver was not
suspended and queries the network administrator whether to proceed anyway.
If, in the step 1018, the network administrator determines not to proceed,
then, in a next step 1020, the hot swap procedure aborts.
If, in the step 1018, the network administrator determines to proceed even
though the device driver was not suspended, the steps 1014 and 1016 loop
up to five times as long as Maestro 208 determines that the device
driver has not been suspended. Programmatic branching repeats at step 1018
as described above.
If Maestro 208 determines, in the step 1016, that the device driver was
suspended, then, in a step 1022, each of the steps 606-616 (see FIG. 6) is
performed. Those steps shut off power to the selected I/O expansion slot,
place an adapter in the expansion slot, and restore power to the selected
I/O expansion slot; the only difference is that, after power to the I/O
expansion slot shuts off, the existing adapter is removed before a
different adapter is placed in the expansion slot.
In a step 1024, Maestro generates a request to resume operations on the
suspended device driver. Maestro 208 uses the adapterCommand MIB variable
to generate that request. Maestro 208 sends the request to the server
computer 204. The SNMP extension agent receives the resume operation
request and calls the configuration manager 218 to resume operation of the
identified device driver. The configuration manager 218 returns a
successful completion code upon resuming the operation of the driver. The
SNMP extension agent 210 sends a successful completion response message to
the client computer 202. In a step 1026, Maestro 208 verifies that the
newly added adapter and device driver are working normally according to
the procedure described in step 618 (see FIG. 6).
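The ordering of the steps 1014 through 1026 for an adapter that supports hot swap operations can be sketched as follows. Each step is represented by an injected callable so that the sketch stays independent of any network or hardware detail; the function name and return strings are hypothetical, and the retry and force suspend branches are omitted.

```python
from typing import Callable


def hot_swap_supported_adapter(
    suspend_driver: Callable[[], None],           # step 1014
    driver_is_suspended: Callable[[], bool],      # step 1016
    power_cycle_and_replace: Callable[[], None],  # step 1022 (steps 606-616)
    resume_driver: Callable[[], None],            # step 1024
    adapter_is_normal: Callable[[], bool],        # step 1026 (as in step 618)
) -> str:
    """Run the hot swap steps in order and report the outcome."""
    suspend_driver()
    if not driver_is_suspended():
        return "driver not suspended"  # step 1018 would prompt the administrator
    power_cycle_and_replace()
    resume_driver()
    return "success" if adapter_is_normal() else "failed verification"
```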
This invention may be embodied in other specific forms without departing
from the essential characteristics as described herein. The embodiments
described above are to be considered in all respects as illustrative only
and not restrictive in any manner. The scope of the invention is indicated
by the following claims rather than by the foregoing description. Any and
all changes which come within the meaning and range of equivalency of the
claims are to be considered within their scope.