


United States Patent 5,751,992
Bhargava, et al.    May 12, 1998

Computer program product for continuous destaging of changed data from a shared cache in a multisystem shared disk environment wherein castout interest is established in a hierarchical fashion

Abstract

A computer program product for a system and method for destaging changed data from a shared cache castout in a shared data environment distributes castout ownership for a dataset among a plurality of systems having an update interest in the dataset. One system is designated as having castout ownership of the dataset in a primary state. All other systems having an update interest in the dataset hold castout ownership for the dataset in a backup state. When the primary owner relinquishes ownership, primary ownership is passed to one of the backup systems. A structure castout owner for the entire cache assumes the responsibility of casting out changed data for a dataset for which the primary castout owner has had a failure.


Inventors: Bhargava; Gautam (Cupertino, CA); Narang; Inderpal Singh (Saratoga, CA); Teng; James Z. (San Jose, CA)
Assignee: International Business Machines Corporation (Armonk, NY)
Appl. No.: 454846
Filed: May 31, 1995

Current U.S. Class: 711/130; 707/8; 709/213; 711/113; 718/102
Intern'l Class: G06F 012/02
Field of Search: 395/182.03,478,732,438,457,200.07,200.08,200.03,440


References Cited
U.S. Patent Documents
4399504    Aug., 1983    Obermarck et al.    364/200.
4422142    Dec., 1983    Inabu et al.        395/308.
4610013    Sep., 1986    Long et al.         371/9.
4965719    Oct., 1990    Shoens et al.       364/200.
5129089    Jul., 1992    Nielsen             395/725.
5142638    Aug., 1992    Schiffleger         395/478.
5280611    Jan., 1994    Mohan et al.        395/608.
5287473    Feb., 1994    Mohan et al.        395/460.
5339427    Aug., 1994    Elko et al.         395/725.
5423044    Jun., 1995    Sutton et al.       395/725.
5459862    Oct., 1995    Garliepp et al.     395/600.
5488716    Jan., 1996    Schneider et al.    395/182.
5491787    Feb., 1996    Hashemi             395/182.
5493668    Feb., 1996    Elko et al.         395/457.


Other References

Bhargava, G. et al., "Relational Database Systems With Zero Information Loss," IEEE Transactions On Knowledge And Data Engineering, vol. 5, Issue 1, Feb. 1993, pp. 76-87.
Date, C.J., "An Introduction To Database Systems, Vol. II," QA76.9,D3D367, Addison-Wesley, Massachusetts, 1983, pp. 291-341.
IBM Technical Disclosure Bulletin, "Distributed Locking Facility For Multi-System Data Sharing," vol. 31, No. 1, Jun., 1988, pp. 206-211.
Mohan, C. et al., "Data Base Recovery In Shared Disks And Client-Server Architectures," Proceedings of the 12th International Conference on Distributed Computing Systems, Jun. 1992, pp. 310-317.

Primary Examiner: Chan; Eddie P.
Assistant Examiner: Yip; Vincent
Attorney, Agent or Firm: Dawkins, Esq.; Marilyn Smith, Sterne, Kessler, Goldstein & Fox P.L.L.C.

Parent Case Text



This application is a division of application Ser. No. 08/311,755, filed Sep. 23, 1994 (status: pending).
Claims



Having thus described our invention, what we claim as new and desire to secure by Letters Patent is:

1. A computer program product for use with a computer system in a shared data environment, comprising:

a computer usable medium having computer readable program code means embodied in said medium for distributing a castout process for datasets among a plurality of systems in a shared data environment, said computer readable program code means comprising:

computer readable first program code means for enabling a computer to assign a first system attempting to perform an update of a dataset in a cache shared by said plurality of systems a primary castout interest for said dataset, wherein said first computer system maintains responsibility for destaging said dataset from said shared cache to a memory shared by said plurality of systems; and

computer readable second program code means for assigning any one or more second systems attempting to perform a subsequent update of said dataset in said shared cache a backup castout interest for said dataset, wherein said one or more second systems maintain responsibility for destaging said dataset from said shared cache to said shared memory in the event that said first computer system fails,

wherein said first system gives said one or more second systems an opportunity to accept or decline said primary castout interest for said dataset and passes said primary castout interest thereto if said second systems accept said primary castout interest for said dataset.

2. A computer program product for use with a computer system in a shared data environment, comprising:

a computer usable medium having computer readable program code means embodied in said medium for distributing a castout process for datasets among a plurality of systems in a shared data environment, wherein each system has a processor and a memory, and wherein the plurality of systems share a high-speed electronic storage and a low-speed direct access storage, said computer readable program code means comprising:

computer readable code means for enabling a computer to assign a castout lock for a dataset in a backup state to a first computer system having an update interest in the dataset, wherein said first computer system is tasked with casting out an updated version of the dataset from said high-speed electronic storage to said low-speed direct access storage in the event that a system having a castout lock for the dataset in a primary state fails;

said computer readable code means for enabling a computer to assign a castout lock for the dataset in a primary state to said first computer system if no other of said systems holds castout lock in said primary state, wherein said first computer system is tasked with casting out an updated version of the dataset from said high-speed electronic storage to said low-speed direct access storage, and

wherein a second computer system having castout lock for the dataset in said backup state is given an opportunity to accept or decline castout lock for the dataset in said primary state if said first computer system fails, and

wherein castout lock for the dataset in said primary state is passed to said second computer system if said second computer system accepts castout lock for the dataset in said primary state.

3. The computer program product of claim 2, wherein said computer readable code means further comprises:

computer readable code means for enabling a computer to notify said second computer system having castout lock for the dataset in said backup state that castout lock for the dataset in said primary state is available; and

computer readable code means for enabling a computer to pass a castout lock for the dataset in said primary state to said notified second computer system.

4. The computer program product of claim 2, wherein said computer readable code means further comprises computer readable code means for enabling a computer to cast out the dataset and fold the castout process if no computer system has castout lock for the dataset in said backup state.

5. The computer program product of claim 2, wherein said computer readable code means further comprises:

computer readable code means for enabling a computer to assign structure castout ownership to a third computer system;

computer readable code means for enabling a computer to cause said assigned third computer system to compile a list of datasets having dirty data; and

computer readable code means for enabling a computer to cause said assigned third computer system to notify owners of said dirty datasets to execute a castout process for the datasets thereof.

6. The computer program product of claim 5, wherein said structure castout owner determines a timestamp for recovery.

7. A program storage device readable by a machine, tangibly embodying a program of instructions executable by the machine to perform method steps for distributing a castout process for datasets among a plurality of systems in a shared data environment, said method steps comprising the steps of:

assigning a first system attempting to perform an update of a dataset in a cache shared by said plurality of systems a primary castout interest for said dataset, wherein said first computer system maintains responsibility for destaging said dataset from said shared cache to a memory shared by said plurality of systems; and

assigning any one or more second systems attempting to perform a subsequent update of said dataset in said shared cache a backup castout interest for said dataset, wherein said one or more second systems maintain a responsibility for destaging said dataset from said shared cache to said shared memory in the event that said first computer system fails,

wherein said one or more second systems are given an opportunity to accept or decline said primary castout interest for said dataset if said first system fails, and

wherein primary castout interest for said dataset is passed to said one or more second systems if said one or more second systems accept primary castout interest for said dataset.

8. A program storage device readable by a machine, tangibly embodying a program of instructions executable by the machine to perform method steps for distributing a castout process for datasets among a plurality of systems in a shared data environment, wherein each system has a processor and a memory, and wherein the plurality of systems share a high-speed electronic storage and a low-speed direct access storage, said method comprising:

assigning a first computer system having an update interest in a dataset castout lock for the dataset in a backup state, wherein said first computer system is tasked with casting out an updated version of the dataset from said high-speed electronic storage to said low-speed direct access storage in the event that a system having a castout lock for the dataset in a primary state fails;

assigning said first computer system castout lock for the dataset in a primary state if no other of said systems holds castout lock in said primary state, wherein said first computer system is tasked with casting out an updated version of the dataset from said high-speed electronic storage to said low-speed direct access storage;

giving a second computer system having castout lock for the dataset in said backup state an opportunity to accept or decline castout lock for the dataset in said primary state if said first system fails, and

passing castout lock for the dataset in said primary state to said second computer system if said second computer system accepts castout lock for the dataset in said primary state.

9. The computer program storage device of claim 8, wherein said step of causing said first computer system to pass castout lock comprises the steps of:

notifying said second computer system having castout lock for the dataset in said backup state that castout lock for the dataset in said primary state is available; and

passing castout lock for the dataset in said primary state to said notified second computer system.

10. The computer program storage device of claim 8 wherein said method for distributing a castout process further comprises the step of casting out the dataset and folding the castout process if no computer system has castout lock for said dataset in the backup state.

11. The computer program storage device of claim 8, wherein said method for distributing a castout process further comprises the steps of:

assigning structure castout ownership to a third computer system;

causing said assigned third computer system to compile a list of datasets having dirty data; and

causing said assigned third computer system to notify owners of said dirty datasets to execute a castout process for the datasets thereof.

12. The computer program storage device of claim 11, wherein said method for distributing a castout process further comprises the step of said structure castout owner determining a timestamp for recovery.

13. A computer program product for use with a data processing system for distributing a castout process for datasets among a plurality of computer systems in a shared data environment, wherein each said system has a processor and a memory, and wherein the plurality of systems share a high-speed electronic storage and a low-speed direct access storage, comprising:

a computer useable medium having a computer readable program means embodied in said medium for distributing a castout process among said plurality of computer systems, said computer readable program code means comprising:

a computer readable program code means for enabling a computer to assign a castout lock for a dataset in a backup state to a first computer system having an update interest in the dataset, wherein said first computer system is tasked with casting out an updated version of the dataset from said high-speed electronic storage to said low-speed direct access storage in the event that a system having a castout lock for the dataset in a primary state fails;

said computer readable program code means for enabling a computer to assign a castout lock for the dataset in a primary state to said first computer system if no other of said systems holds castout lock in said primary state, wherein said first computer system is tasked with casting out an updated version of the dataset from said high-speed electronic storage to said low-speed direct access storage,

wherein a second computer system having castout lock for the dataset in said backup state is given an opportunity to accept or decline castout lock for the dataset in said primary state if said first computer system fails, and

wherein castout lock for the dataset in said primary state is passed to said second computer system if said second computer system accepts castout lock for the dataset in said primary state.

14. The computer program product of claim 13, wherein said computer readable program code means for passing castout lock for the dataset in the primary state to said second computer system having castout lock for the dataset in said backup state further comprises:

computer readable program code means for enabling a computer to notify said second computer system having castout lock for the dataset in said backup state that castout lock in said primary state is available; and

computer readable program code means for enabling a computer to pass castout lock for the dataset in said primary state to said notified second computer system.

15. The computer program product of claim 13, wherein said computer readable program code means further comprises computer readable program code means for enabling a computer to cast out the dataset and fold the castout process if no system has castout lock for the dataset in said backup state.

16. The computer program product of claim 13, wherein said computer readable program code means further comprises:

computer readable program code means for enabling a computer to assign structure castout ownership to a third computer system;

computer readable program code means for enabling a computer to compile a list of datasets having dirty data; and

computer readable program code means for enabling a computer to notify owners of said dirty datasets to execute a castout process for said datasets thereof.

17. A computer program product for use with a computer system, the computer system comprising:

a plurality of processors;

data storage means common to said processors for storing datasets shared among said plurality of processors; and

cache means common to said processors for updating said datasets for said plurality of processors,

wherein each said processor having an update interest in a changed dataset has a castout process for transmitting said changed dataset from said cache means to said data storage means and deleting an entry for said dataset from said cache means, said computer program product comprising:

a computer usable medium having computer readable program code means embodied in said medium for distributing a castout process among a plurality of systems, said computer readable program code means comprising:

computer readable program code means for enabling a computer to assign a first of said plurality of processors a primary interest in casting out a changed dataset, wherein said first system maintains primary responsibility for performing said castout process; and

computer readable program code means for enabling a computer to assign a second of said plurality of processors a backup interest in casting out a changed dataset, wherein said second system is provided primary responsibility for performing said castout process if said first system fails,

wherein said first of said plurality of processors gives said second of said plurality of processors an opportunity to accept or decline said primary interest in casting out said changed dataset and passes said primary interest thereto if said second of said plurality of processors accepts said primary interest.

18. A computer program product for use with a data processing system, said data processing systems distributing a castout process for datasets among a plurality of computer systems in a shared data environment, wherein each system has a processor and a memory, and wherein the plurality of systems share a high-speed electronic storage and a low-speed direct access storage, the data processing system comprising:

one or more application processes, each said application process running in one of said plurality of systems and having an update interest in a dataset; and

a shared data storage means for storing the dataset for access by said one or more application processes, said program storage device tangibly embodying a program of instructions executable by the computer system to perform a method for distributing a castout process among said plurality of systems, said method comprising:

assigning a first computer system having an update interest in the dataset castout lock for the dataset in a backup state, wherein said first computer system is tasked with casting out an updated version of the dataset from said high-speed electronic storage to said low-speed direct access storage in the event that a system having a castout lock for the dataset in a primary state fails;

assigning said first computer system castout lock for the dataset in a primary state if no other of said systems holds castout lock in said primary state, wherein said first system is tasked with casting out an updated version of the dataset from said high-speed electronic storage to said low-speed direct access storage,

giving a second computer system having castout lock for the dataset in said backup state an opportunity to accept or decline castout lock for the dataset in said primary state if said first system fails, and

passing castout lock for the dataset in said primary state to said second computer system if said second computer system accepts castout lock for the dataset in said primary state.

19. The program storage device of claim 18, wherein said step of passing castout lock for the dataset in the primary state to a second system having castout lock for the dataset in the backup state further comprises:

notifying said second computer system that castout lock for the dataset in said primary state is available; and

passing castout lock for the dataset in said primary state to said notified second computer system.

20. The program storage device of claim 18, wherein said method steps further comprise the step of casting out the dataset and folding the castout process if no system has castout lock for the dataset in said backup state.

21. The program storage device of claim 18, wherein said method steps further comprise the steps of:

assigning structure castout ownership to a third system;

compiling a list of datasets having dirty data; and

notifying owners of said dirty datasets to execute a castout process for said datasets thereof.
Description



CROSS-REFERENCE TO OTHER APPLICATION(S)

The following application of common assignee is related to the present application, and is believed to have an effective filing date identical with that of the present application:

U.S. Patent Application entitled System And Method For Including Stored Procedures, User-Defined Functions, And Trigger Processing In An Existing Unit Of Work, by Curt L. Cotner, Robert J. Heaps, Wendy L. Koontz, Benjamin R. Little, and Thomas J. Toomire, Ser. No. 08/311,035 (filed Sep. 23, 1994) incorporated herein by reference in its entirety.

1. Technical Field

The present invention relates generally to shared disk systems and more particularly to a system and method for continuous destaging of changed data from a shared cache.

2. Background Art

With the advent of shared data systems came the requirement that data updates be propagated to all systems sharing those data. Several techniques have been developed in response to this requirement. One such technique utilizes a high-speed electronic cache for propagating data updates among the various systems in the data sharing facility. One example of such a cache is a coupling facility available from International Business Machines Corporation.

In a shared data facility implementing this cache technique, when one system updates data (for example, a page of data in a data file), that system forwards the updated data (e.g., the updated page) to the cache. However, due to the high costs typically associated with cache storage, the size of such a cache is much smaller than the capacity of the shared disk(s). Consequently, a process is provided to periodically destage, or cast out, changed data from the shared cache. Such a process is commonly known as a castout process. In this environment, the system performing the update sends the updated data (e.g., the page) to the cache. At some later moment, the castout process fetches the updated data from the cache and replaces the corresponding data on the shared disk.
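
The write-then-castout sequence just described can be illustrated with a minimal sketch. It assumes a single in-memory cache and models the shared disk as a plain dictionary; the names SharedCache, write, and castout are hypothetical and are not taken from the patent.

```python
class SharedCache:
    """Store-in cache: updates land here first and reach disk only at castout."""

    def __init__(self):
        self.pages = {}      # page id -> latest contents
        self.dirty = set()   # page ids changed since the last castout

    def write(self, page_id, data):
        # An updating system forwards the changed page to the cache.
        self.pages[page_id] = data
        self.dirty.add(page_id)


def castout(cache, disk):
    """Copy every changed page from the cache to the shared disk."""
    written = sorted(cache.dirty)
    for page_id in written:
        disk[page_id] = cache.pages[page_id]
    cache.dirty.clear()
    return written


disk = {"p1": "old"}          # hypothetical shared disk contents
cache = SharedCache()
cache.write("p1", "new")      # the update goes to the cache only
stale = disk["p1"]            # the disk copy is unchanged until castout runs
flushed = castout(cache, disk)
```

Until castout runs, the disk holds the older version of the page, which is why the castout process must not be lost.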

SUMMARY OF THE INVENTION

The present invention is directed to a system and method for distributing the task of destaging, or casting out, changed data in a shared cache. According to the invention, each system that has an update interest in a dataset is capable of acquiring a castout ownership interest in this dataset. Thus, the invention provides for continuous destaging of changed data from the shared cache in the presence of failures of systems using the cache. The destaging of cached data is distributed among the systems for the purpose of load balancing and for backup in the event of a system failure.

A hierarchical structure for castout ownership is provided. The systems using the shared cache for their shared data allocate a piece of the cache referred to as the cache structure. This cache structure is used for caching data for one or more datasets which are shared by multiple systems. One of these systems using the cache structure becomes the "structure owner" and has the responsibility for initiating castouts at the dataset level.

Systems can hold a castout ownership interest in the dataset at one of two levels: primary and backup. At any one time, only one system can hold a primary castout ownership interest; all other systems having castout ownership have such ownership in the backup state.

The first time a system updates a shared dataset, that system attempts to acquire castout ownership. This system is granted primary castout ownership, but only if no other system holds primary castout ownership for that dataset. Otherwise, the system acquires a backup ownership interest in the dataset.

When a system having castout lock in a dataset loses its update interest in that dataset, the system relinquishes castout lock for that dataset. If the system held the lock in the backup state, no further action is required. If, however, the system held a primary ownership interest in the dataset, primary ownership must pass to a backup system. If no backup system exists, all changed data from the dataset are cast out by the primary system and the castout process for that dataset is folded up before the primary system relinquishes ownership. If one or more backup systems do exist, primary ownership responsibility is passed on to one of these systems.
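
The three relinquish cases above can be sketched as a single function. This is an illustrative sketch only: the function name, the dictionary layout, and the choice of the longest-waiting backup as successor are assumptions, and the chosen backup is assumed to accept primary ownership.

```python
def relinquish(system, dataset, primary, backups, cast_out_all):
    """Release `system`'s castout lock on `dataset` and report what happened.

    primary: dataset -> system holding the lock in the primary state
    backups: dataset -> list of systems holding the lock in the backup state
    cast_out_all: callable(dataset) that destages all changed data
    """
    waiting = backups.get(dataset, [])
    if system in waiting:
        waiting.remove(system)        # backup holder: nothing more to do
        return "backup released"
    if primary.get(dataset) != system:
        return "no lock held"
    if not waiting:
        cast_out_all(dataset)         # flush all changed data first
        del primary[dataset]          # then fold up castout for the dataset
        return "cast out and folded"
    # Assumption: the oldest backup is chosen and accepts.
    primary[dataset] = waiting.pop(0)
    return "passed to " + primary[dataset]


primary = {"DS1": "104A"}
backups = {"DS1": ["104B"]}
flushed = []
first = relinquish("104A", "DS1", primary, backups, flushed.append)
second = relinquish("104B", "DS1", primary, backups, flushed.append)
```

In this run, system 104A hands primary ownership to 104B, and 104B, now without any backup, casts out the dataset before giving up the lock.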

A structure castout ownership level is provided whereby a system is designated as having castout responsibility for the whole cache structure in the shared cache. The system having castout ownership at the structure level accomplishes its tasks by calling on systems having castout ownership (preferably primary ownership) for the datasets. Periodically, a process within the structure castout owner takes a checkpoint to determine a timestamp for recovery purposes. At checkpoint, the structure castout owner attempts to cast out all dirty data for the cache structure used by the system.

In the event of a failure of the primary castout owner for a dataset, the structure castout owner detects the absence of the primary castout owner and assumes the responsibility for casting out changed data of that dataset.
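
One checkpoint pass by the structure castout owner might look like the following sketch. The function name and its inputs are hypothetical; failure detection is reduced to a membership test against a set of running systems, and the recovery timestamp is not modeled.

```python
def structure_checkpoint(dirty_pages, primary_owner, live_systems):
    """Plan one checkpoint pass for the structure castout owner.

    dirty_pages: dataset -> number of changed pages in the cache structure
    primary_owner: dataset -> system holding primary castout ownership
    live_systems: set of systems currently running (assumed to be known)
    Returns (notify, takeover).
    """
    notify = {}      # system -> datasets it is asked to cast out
    takeover = []    # datasets the structure owner casts out itself
    for dataset in sorted(dirty_pages):
        if dirty_pages[dataset] == 0:
            continue                      # nothing to destage
        owner = primary_owner.get(dataset)
        if owner in live_systems:
            notify.setdefault(owner, []).append(dataset)
        else:
            takeover.append(dataset)      # primary owner absent or failed
    return notify, takeover


notify, takeover = structure_checkpoint(
    dirty_pages={"DS1": 3, "DS2": 0, "DS3": 5},
    primary_owner={"DS1": "104A", "DS3": "104C"},
    live_systems={"104A", "104B"},
)
```

Here DS1 is delegated to its live primary owner, DS2 is skipped because it holds no dirty data, and DS3 falls to the structure owner because its primary owner is not running.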

An advantage of the invention is that castout responsibility is distributed among several systems. Thus, bottlenecks caused by an overloaded system can be avoided. A further advantage is that by distributing the castout workload, the impact on transaction processing is minimized and protection is provided against loss of the castout feature.

One object of the invention is to have an uninterrupted service of casting out changed pages from the shared cache regardless of a failure of individual system(s) performing that task.

Further features, advantages and objects of the present invention, as well as the structure and operation of various embodiments of the present invention, are described in detail below with reference to the accompanying drawings.

The foregoing and other objects, features and advantages of the invention will be apparent from the following more particular description of (a) preferred embodiment(s) of the invention, as illustrated in the accompanying drawing(s).

BRIEF DESCRIPTION OF DRAWINGS

The present invention is described with reference to the accompanying drawings. In the drawings, like reference numbers indicate identical or functionally similar elements. Additionally, the left-most digit(s) of a reference number identifies the drawing in which the reference number first appears.

FIG. 1 is a block diagram illustrating a shared data environment;

FIG. 2 is an operational flow diagram illustrating a method by which primary or backup castout ownership is established for a dataset;

FIG. 3 is an operational flow diagram illustrating a method by which a system can relinquish castout lock on a dataset;

FIG. 4 is an operational flow diagram illustrating a method by which a system manages structure castout;

FIG. 5 is an operational flow diagram illustrating a method by which a system having primary structure castout lock acquires castout lock for a dataset.

DETAILED DESCRIPTION

The present invention is directed toward a system and method for distributing the castout work of a shared cache process among several systems (e.g., DBMS, or data base management systems) in a shared data facility. This distribution is also used to manage recovery from failed castout processes caused by system failures. According to the invention, when a system is required to update data (such as a page in a dataset), it attempts to become a castout owner for that dataset. As the castout owner, that system contains the castout process for that dataset.

Because multiple systems may wish to update the same dataset, and hence, attempt to become a castout owner, the invention provides for the establishment of a castout owner hierarchy and dynamic assignment of castout ownership within this hierarchy. According to this hierarchy, one system is the primary castout owner for the dataset and all others are backup owners for this dataset. It is the system that is the primary owner that contains the castout process. If the primary owner relinquishes its castout interest, a backup owner is upgraded to the status of a primary owner. If there is no backup owner for that dataset, a cache structure owner eventually assumes the responsibility for casting out changed data for that dataset.

FIG. 1 is a block diagram illustrating an example of a shared data facility 100 in which the invention can be implemented. According to this example, shared data facility 100 includes several systems 104A-104N (generally referred to as systems 104), a shared disk 108 and a coupling facility 112 which contains a cache structure 113. Systems 104 running application programs share data stored in shared disk 108. Data updates are performed using coupling facility 112. Systems 104 can be collocated in a single location or plant. Alternatively, systems 104 can be located at various different sites physically removed from each other.

In the description provided herein, data updates are discussed in terms of data in data files (or, using IBM terminology, datasets). After reading the description provided herein, it will become apparent to a person skilled in the relevant art how to implement the invention using other data structures.

If coupling facility 112 were to be implemented as a write-through cache, data updates would be automatically forwarded to shared disk 108. However, where coupling facility 112 is not a write-through facility, the updates of data (e.g., a page) in a dataset are first stored in the cache structure 113. At some later time, the updated data are read by a system 104 and the shared disk 108 version of the data is replaced. Conventional systems assign the castout process to a single system, which handles the updates for the entire cache structure 113. There are, however, disadvantages with such a conventional arrangement. First, if the system 104 that contains the castout process goes down or is otherwise unavailable, the castout cannot take place. If the castout feature is lost, data sharing can potentially be disabled because there is no longer a way to propagate updates across the systems 104. Second, in a data sharing facility 100 handling a high transaction volume, it is important not to overburden any one system 104 with the task of performing castout.

To get around these drawbacks, the castout workload is distributed among multiple systems 104. By distributing the castout workload, the impact on transaction processing is minimized (i.e., CPU cycles used up by castout), and protection is provided against loss of the castout feature due to a failure on the system 104 that is responsible for castout.

In one embodiment, the services of a lock manager, a commonly available service in transaction processing systems, are utilized to distribute the castout workload among systems 104. Responsibility for the castout of a particular dataset is assigned to a particular system 104. Because each system 104 is susceptible to failure, a scheme is provided whereby the castout responsibility can be transferred to another system 104. According to the invention, castout responsibility is assigned according to the dataset (e.g., file or pageset) update interest. Data in a given dataset are cast out by the system 104 that has established itself as the primary castout owner for that file. This system can be considered the "owner" of the castout function for that file.

Because the castout responsibility is distributed among multiple systems 104, there can be more than one castout owner for the entire collection of files. For each dataset, however, only one system 104 is designated as having primary castout responsibility. All other systems 104 having an update interest in a dataset have backup castout responsibility for that dataset.

The first time a system 104, for example system 104A, updates data in a shared dataset, that system 104A attempts to become the castout owner for that dataset. It does so by acquiring a castout lock in the backup state. If no other system 104 is the primary castout owner for that dataset, system 104A acquires primary castout ownership. In this case, system 104A is said to have castout lock in the primary state. If another system 104 already has primary responsibility for that dataset, system 104A acquires backup castout ownership. In this second case, system 104A acquires castout lock in the backup state.

The process of acquiring castout lock is now described. FIG. 2 is an operational flow diagram illustrating the process of acquiring castout lock. Referring now to FIG. 2, in a step 204, a system 104 (e.g., 104A) opens a dataset for the first update of that dataset and attempts to acquire castout responsibility for that dataset. In a step 208, system 104A acquires castout lock in the backup state for that dataset. Because there can be numerous systems 104 having castout lock in the backup state, such lock is granted in step 208 unconditionally.

In a step 212, system 104A attempts to acquire castout lock in the primary state for the subject dataset. If the primary state is available in step 216 (i.e., if no other system 104 has primary castout ownership for that dataset), system 104A acquires castout lock in the primary state as illustrated by step 220. If, on the other hand, the primary state is not available in step 216, system 104A maintains castout lock in the backup state as illustrated by step 224.
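The acquisition sequence of FIG. 2 can be sketched as follows. This is a minimal illustration only, not the patented implementation: the `LockManager` and `CastoutLock` classes, their method names, and the in-memory dictionary are hypothetical stand-ins for a real global lock manager, and moving a system off the backup list when it is promoted is a modeling assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class CastoutLock:
    """Castout lock state for one dataset (hypothetical structure)."""
    primary: Optional[str] = None                      # holder of the primary state
    backups: List[str] = field(default_factory=list)   # holders of the backup state


class LockManager:
    """Toy in-memory stand-in for the global lock manager."""

    def __init__(self) -> None:
        self.locks: Dict[str, CastoutLock] = {}

    def acquire_backup(self, system: str, dataset: str) -> None:
        # Step 208: the backup state is granted unconditionally.
        lock = self.locks.setdefault(dataset, CastoutLock())
        if system != lock.primary and system not in lock.backups:
            lock.backups.append(system)

    def try_acquire_primary(self, system: str, dataset: str) -> bool:
        # Steps 212-224: the primary state is granted only if it is free.
        lock = self.locks.setdefault(dataset, CastoutLock())
        if lock.primary is not None:
            return lock.primary == system
        if system in lock.backups:
            lock.backups.remove(system)  # assumption: promotion leaves the backup list
        lock.primary = system
        return True


def open_for_first_update(lm: LockManager, system: str, dataset: str) -> str:
    """Run the FIG. 2 sequence and report the resulting lock state."""
    lm.acquire_backup(system, dataset)                 # backup first ...
    if lm.try_acquire_primary(system, dataset):        # ... then attempt primary
        return "primary"
    return "backup"


lm = LockManager()
state_a = open_for_first_update(lm, "104A", "DS1")
state_b = open_for_first_update(lm, "104B", "DS1")
print(state_a, state_b)
```

In this sketch the first updater of dataset `DS1` ends up holding the primary state, and every later updater remains a backup holder.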

This protocol of first becoming a backup castout owner and then primary ensures that each system having an update interest in the dataset can become the primary castout owner for that dataset if no other primary exists. This is because by first becoming the backup, the system 104 gets on the lock manager's notification list for becoming the primary in the event the primary fails or relinquishes its responsibility.

According to one embodiment, a system 104 acquires dataset castout responsibility only if it has an update interest in the dataset, and not if it has read interest only. This protocol is advantageous where a dataset is to be opened in a read/write mode only when a system 104 has an update interest. Thus, a system 104 with a read-only interest would not be a good candidate for writing castout data to the associated dataset on shared disk 108.

When a system 104 having castout lock in a dataset loses its update interest, it relinquishes castout lock. The process of relinquishing castout lock is described with reference to the operational flow diagram depicted in FIG. 3.

Referring now to FIG. 3, in a step 304, a system 104 having castout lock for a dataset (e.g., system 104A) relinquishes its update interest in that dataset. In a step 308, that system 104A relinquishes castout lock for that dataset. If system 104A held lock in the backup state (decision block 312), no further processing takes place in other instances of systems 104. If, however, system 104A held lock in the primary state (decision block 312), primary ownership must pass to a backup system 104. However, if no backup system 104 exists (decision block 316), the dataset is cast out by system 104A and the castout process is folded up as illustrated in a step 320.

If a backup system 104 exists, in a step 324 the global lock manager notifies the backup system 104 (e.g., system 104B) that it may now hold lock in the primary state. If the backup system 104B accepts this primary responsibility for the dataset (decision block 328), system 104B becomes the primary owner of the castout responsibility for this dataset as illustrated in a step 332.

When the primary owner relinquishes its interest, the global lock manager informs the primary owner as to the state of the castout lock held by other systems 104 for that dataset.

Note that system 104B may refuse to accept primary castout responsibility for the dataset as illustrated by decision block 328. This may be because system 104B is overloaded and does not want to accept castout responsibility. However, if system 104B is the only remaining system 104 that holds castout lock for this dataset, it is not allowed to refuse primary castout responsibility.
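The relinquish protocol of FIG. 3, including the right of a backup to refuse primary responsibility, might be modeled as in the sketch below. The dictionary layout, the `relinquish` function, and the `accepts` and `cast_out` callbacks are hypothetical. The text does not say what happens when several backups all refuse, so this sketch assumes the last candidate offered is not allowed to refuse.

```python
from typing import Callable, Dict, Optional


def relinquish(lock: Dict, system: str,
               accepts: Callable[[str], bool],
               cast_out: Callable[[str], None]) -> Optional[str]:
    """Release `system`'s castout lock; return the primary owner afterwards."""
    if system in lock["backups"]:
        # Decision block 312, backup case: nothing else has to happen.
        lock["backups"].remove(system)
        return lock["primary"]
    if lock["primary"] != system:
        return lock["primary"]            # system held no castout lock at all
    candidates = list(lock["backups"])
    lock["primary"] = None
    if not candidates:
        # Step 320: no backup exists, so the departing primary casts out itself.
        cast_out(system)
        return None
    for i, candidate in enumerate(candidates):
        is_last = i == len(candidates) - 1
        # Steps 324-332: offer the primary state; assumed rule: the last
        # candidate offered may not refuse.
        if is_last or accepts(candidate):
            lock["backups"].remove(candidate)
            lock["primary"] = candidate
            return candidate
    return None


cast_log = []
lock = {"primary": "104A", "backups": ["104B", "104C"]}
# 104B is overloaded and refuses; 104C accepts.
new_primary = relinquish(lock, "104A",
                         accepts=lambda s: s != "104B",
                         cast_out=cast_log.append)
print(new_primary, lock)
```

Here system 104B stays a backup after refusing, and primary ownership passes to 104C without any castout by the departing system.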

In one embodiment, some systems may be disabled from having castout responsibility, or even disabled from having primary castout responsibility, due to cost or performance constraints. For example, due to communications costs and bandwidth constraints, it would be costly and/or inefficient to assign primary castout responsibility to a system 104 that is located at a very far distance from the shared disk 112.

In one embodiment, the castout feature of the invention uses a two-level hierarchy of systems 104 that communicate via messages. At the lower level of the hierarchy are systems 104 that are in charge of castouts for datasets. At the higher level is a system 104 that controls castout responsibility for the whole cache structure 113. This system 104 at the higher level is referred to as the structure castout owner. The structure castout owner accomplishes its tasks by calling upon services of systems 104 at the lower level. Structure castout ownership is assigned by acquiring a structure castout lock in the primary state. Backup responsibility is assigned by acquiring structure castout lock in the backup state.

A method by which the structure castout owner manages cleanup of datasets with changed data is illustrated in FIG. 4. In a step 404, one system (e.g., system 104A) is assigned structure castout ownership in the primary state. Periodically, a process within structure castout owner system 104A takes a cache structure checkpoint. Such a checkpoint is preferably time driven. This is illustrated in step 408. The purpose of this checkpoint is to determine a timestamp for recovery purposes. Also, at the cache structure checkpoint, the system attempts to cast out all data for cache structure 113.

At cache structure checkpoint, the structure castout owner system 104A starts processing a cache structure's directory to get a "min" timestamp for recovery purposes. This is illustrated by a step 412. This process is fully described in U.S. Pat. No. 5,280,611, to Mohan et al., the full disclosure of which is incorporated herein by reference as if reproduced in full below.

While the structure castout owner system 104A scans the coupling facility directory, in a step 416, system 104A builds a list of datasets having updated data in the cache structure 113. In a step 420, the structure castout owner system 104A notifies the respective owner systems 104 of these datasets to cast out all updated data for their datasets.
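The checkpoint scan of steps 412 through 420 can be illustrated with a short sketch. The `DirEntry` record, its `timestamp` field, and the `structure_checkpoint` function are hypothetical, and taking the "min" timestamp to be the smallest timestamp among changed entries is an assumption made for illustration; the referenced Mohan et al. patent defines the actual procedure.

```python
from typing import Dict, List, NamedTuple, Optional, Tuple


class DirEntry(NamedTuple):
    """One cache directory entry (hypothetical layout)."""
    dataset: str
    changed: bool
    timestamp: int   # assumed: when the cached page first became changed


def structure_checkpoint(
    directory: List[DirEntry],
    primary_owner: Dict[str, str],
) -> Tuple[Optional[int], Dict[str, List[str]], List[str]]:
    """Scan the directory once and plan the castout notifications."""
    changed = [e for e in directory if e.changed]
    # Step 412: "min" timestamp over changed data (None if the cache is clean).
    min_ts = min((e.timestamp for e in changed), default=None)
    # Step 416: list of datasets that have updated data in the cache.
    datasets = sorted({e.dataset for e in changed})
    # Step 420: group datasets by the system holding the primary castout lock.
    notify: Dict[str, List[str]] = {}
    unowned: List[str] = []
    for ds in datasets:
        owner = primary_owner.get(ds)
        if owner is None:
            unowned.append(ds)           # no primary lock holder for this dataset
        else:
            notify.setdefault(owner, []).append(ds)
    return min_ts, notify, unowned


directory = [
    DirEntry("DS1", True, 40),
    DirEntry("DS1", False, 10),
    DirEntry("DS2", True, 25),
    DirEntry("DS3", True, 30),
]
owners = {"DS1": "104B", "DS2": "104C"}
min_ts, notify, unowned = structure_checkpoint(directory, owners)
print(min_ts, notify, unowned)
```

Datasets that come back in `unowned` are the ones for which the structure castout owner would have to step in itself.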

As stated above, in step 420 the structure castout owner system 104A notifies the primary castout owner of the dataset that there is dirty data. However, if the primary owner system 104 for this dataset has failed and if there is no backup owner, the castout function can only arise at the cache structure checkpoint. When, in step 420, the structure castout owner system 104A determines that there is no primary lock holder for the dataset, system 104A assumes temporary castout ownership for that dataset, casts out the data from the dataset, closes the dataset, and relinquishes the castout responsibility for that dataset. FIG. 5 is an operational flow diagram illustrating the process of assuming temporary castout ownership in greater detail.

As stated above, in step 420 the structure castout owner system 104A notifies the owner system 104 that there is dirty data. More specifically, as illustrated in a step 420, system 104A notifies the system having lock in the primary state for the specific dataset. If there is a system 104 having lock in the primary state (decision block 508), that system 104 handles the castout for the dataset. If, however, there is no system having lock in the primary state for the dataset, in a step 512, the structure castout owner system 104A physically opens the subject dataset.

In a step 516, the structure castout owner system 104A attempts to acquire castout lock for the dataset in the primary state. If this attempt fails (decision block 520), this indicates that another system has acquired primary lock in the dataset. Therefore, the operation continues at step 420 as illustrated by flowline 562. If, on the other hand, castout owner system 104A acquires castout lock for the dataset in the primary state, in step 524 castout owner system 104A casts out the data.
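The temporary-ownership flow of FIG. 5 can be sketched as a retry loop. The lock table, the action log, and the function names below are hypothetical. The sketch assumes that a failed attempt to acquire the primary state always means another system now holds it, so that returning to step 420 finds a primary owner to notify.

```python
from typing import Callable, Dict, List, Optional, Tuple

Locks = Dict[str, Optional[str]]   # dataset -> system holding the primary state


def try_acquire_primary(locks: Locks, dataset: str, system: str) -> bool:
    """Grant the primary state only if no system currently holds it."""
    if locks.get(dataset) is None:
        locks[dataset] = system
        return True
    return False


def process_dirty_dataset(
    locks: Locks,
    dataset: str,
    structure_owner: str,
    log: List[Tuple[str, str]],
    acquire: Callable[[Locks, str, str], bool] = try_acquire_primary,
) -> str:
    """Return the system that ends up handling castout for `dataset`."""
    while True:
        primary = locks.get(dataset)
        if primary is not None:
            # Decision block 508: a primary owner exists and handles castout.
            log.append(("notify", primary))
            return primary
        log.append(("open", structure_owner))        # step 512
        if acquire(locks, dataset, structure_owner):  # step 516
            log.append(("castout", structure_owner))  # step 524
            log.append(("close", structure_owner))
            locks[dataset] = None                     # relinquish temporary ownership
            return structure_owner
        # Decision block 520 failed: another system became primary in the
        # meantime, so loop back to step 420 (assumed to terminate).


locks: Locks = {}
log: List[Tuple[str, str]] = []
handler = process_dirty_dataset(locks, "DS3", "104A", log)
print(handler, log)
```

To exercise the failure path of decision block 520 in a single thread, a caller can pass an `acquire` function that simulates another system winning the race for the primary state.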

While various embodiments of the present invention have been described above, it should be understood that they have been presented by way of example only, and not limitation. Thus, the breadth and scope of the present invention should not be limited by any of the above-described exemplary embodiments, but should be defined only in accordance with the following claims and their equivalents.

BEST MODE FOR CARRYING OUT THE INVENTION

While the invention has been particularly shown and described with reference to preferred embodiments thereof, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention.

