Part II: Applications and Interactions
Modeling Interactions

The relationship between health care entities in the NHIN can be understood by imagining five sample interactions:

  1. Transfer of patient's clinical data between entities in a single SNO
  2. Between entities in two different SNOs
  3. From an entity in a SNO to an unaffiliated entity
  4. From an entity in a SNO to a public health entity or other aggregator
  5. Subscription Models for the above 4 Interactions

1) Transfer of Clinical Data Between Two Entities in a
Single SNO

Step 1: Asking for Record Locations

A patient, Elizabeth Smith, presents at a clinic complaining of shortness of breath. She has never been seen there before, and after she provides basic demographic data, the clinic queries the local RLS for her records.4 Assuming the clinic is a member of the SNO, and has presented the proper credentials for authentication and authorization, the RLS will compare a set of identifiers, for example Ms. Smith's name, DoB, gender, Zip, and SSN, with those records whose locations are listed in the RLS database. Those record locations that have a sufficiently high probability of matching the patient data (as determined by a matching algorithm specifically designed for that purpose) will be returned from the RLS.

The RLS answers all queries from authorized sources within a SNO.
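Concretely, the location query in Step 1 can be sketched as follows. The field names, database shape, and the stand-in matching function are illustrative assumptions for this sketch, not part of the Common Framework specification; a real SNO would substitute its own tuned probabilistic algorithm.

```python
# Illustrative sketch of an RLS demographic query. Field names and the
# database shape are hypothetical; the Common Framework does not mandate
# this schema.

def query_rls(rls_db, demographics, match_score, threshold=0.99999):
    """Return record locators whose stored demographics match the
    submitted ones with sufficiently high probability."""
    matches = []
    for entry in rls_db:
        score = match_score(demographics, entry["demographics"])
        if score >= threshold:
            # Only the locator is returned, never clinical data.
            matches.append(entry["locator"])
    return matches

def exact_match(query, stored):
    """A trivial stand-in for the SNO-specific matching algorithm."""
    fields = ("name", "dob", "gender", "zip", "ssn")
    shared = [f for f in fields if f in query and f in stored]
    if not shared:
        return 0.0
    return 1.0 if all(query[f] == stored[f] for f in shared) else 0.0
```

An empty result list corresponds to a failed search; the clinic would then fall back on asking the patient for more demographic detail.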

Step 2: Aggregating the Identified Records

Once the RLS has matched Ms. Smith's identifying details against the record locations in its database and returned the information necessary to retrieve those records, its work is done. The next step is to use those locations to aggregate the actual records, by querying each location for the patient data it holds. Because the site of aggregation has significant ramifications for the contractual relations within the SNO, it can vary from SNO to SNO. There are several options for aggregating the records whose locations have been returned by the RLS. The Markle Connecting for Health prototype tried three; others may be possible.

Client Aggregation: One method is to have the original requesting client do the aggregating. In this model, the client receives one or more record locations. (Zero locations returned is a failed search.) The client then decides which of these records it would like to attempt to retrieve, e.g., only records from a particular institution, or only records from labs, or all records.5 Client-side aggregation was tested in Massachusetts. The advantages of client aggregation are refined control over record requests, and possibly tighter integration with other local electronic data systems. The disadvantage is higher technical requirements at the participating entities in the SNO.

Central Aggregation Service: A second method of aggregation is to create a central aggregation service, a server that sits between the entities and the RLS itself. The server takes incoming queries and passes them on to the RLS, and then takes the record locations returned and queries each listed location, handing off only the final, aggregated record to the requesting client. Centralized aggregation was tested in Indiana. The advantages of such a service are that it creates economies of scale for the SNO. The disadvantages are less control of the record by the original requestor, and higher security risk, since the aggregation service will hold, even if only for a moment, considerable clinical data about the patient.6

Aggregation Proxy: A third possibility is to run a proxy service that can receive record locations and aggregate them, but is not a required service: some clients may use the remote server for aggregation, while others receive the record locations and do the aggregation locally. The proxy aggregation model was tested in California. The advantage of such a system is that it allows aggregation of records to happen either centrally or at the requesting site. The disadvantage is that it is potentially more complex, in terms of interaction, than the other two models.
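The client-aggregation model described above can be sketched roughly as follows. The `fetch_record` function stands in for an authenticated request to the holding institution; its signature, and the locator fields, are assumptions for illustration, not a specified interface.

```python
# Rough sketch of client-side aggregation (the model tested in
# Massachusetts). `fetch_record` is a hypothetical stand-in for an
# authenticated retrieval call to the institution holding the record.

def aggregate_client_side(locators, fetch_record, wanted=None):
    """Retrieve clinical records for the locators the client selects.

    `wanted` is an optional predicate letting the client narrow the
    request, e.g., to a single institution or to lab records only.
    """
    if not locators:
        raise LookupError("zero locations returned: failed search")
    selected = [loc for loc in locators if wanted is None or wanted(loc)]
    return [fetch_record(loc) for loc in selected]
```

The central-aggregation and proxy models would wrap the same loop behind a server-side service; the difference is where the loop runs and who momentarily holds the assembled clinical data.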

In all three aggregation scenarios, some sort of authentication and authorization must be in place between the requestor and the source of the data, whether peer-to-peer (each entity authenticates directly to each other entity) or, more likely, through the operation of a SNO-wide directory service that allows the entities to identify one another. (See the section on Identity, Authentication, and Authorization, below.)

Importantly, under the Common Framework the actual sources of clinical data are not required to respond to any given request for that data. Individual entities are the stewards of the records, and of the patient's expectation of confidentiality. As a result, those entities may add constraints on data access.7 Examples include added restrictions on a particular set of records at the patient's request, or added restrictions for anyone who does not have admitting privileges at a particular hospital.

Step 3: Displaying or Otherwise Using the Records

The third step involves taking the actual clinical records, wherever aggregated, and making them useful, whether by displaying them directly to the clinician, integrating them into an existing electronic health record (EHR) system, feeding them into a decision support tool, or any of a number of other present or future possible uses of the data.

A key aspect of the current model is that it places no constraints on how the records are made useful, other than to require that the consuming applications abide by policy requirements around privacy, security, auditing, etc. The role of the network is to carry useful data from existing sources to authorized requestors; whether that data is then displayed directly in a browser window or becomes part of a complex database transaction is entirely up to the local user.8 The goal of the Common Framework is to advance the conversation between application designers and users by making data more accessible and better formatted. The Common Framework is not intended to either replace or interfere with those conversations.

2) Transfer of Clinical Data Between Two Entities in Different SNOs

A similar scenario can occur when a clinic in one SNO (A) looks for a patient's records in a second SNO (B) of which the clinic is not a member. The basic three-step transaction is the same, with these differences:

Step 1: Asking for Record Locations

  1. In addition to the patient's demographic details, the clinic needs some information on which other SNOs to query, whether the name of an institution, an affiliation with a particular network (e.g., the VHA), or a region where she previously received care. There is no national index of patients; the reasons for this are discussed in the ISB section below.9
  2. All traffic leaving SNO A goes out through SNO A's ISB; the clinic does not make remote requests directly.
  3. Once a request is validated by the receiving ISB, that request is forwarded to the RLS in SNO B and handled as are internal requests, except that the response goes back to the ISB of SNO B for return to SNO A.
  4. All traffic coming into SNO B from entities that are not members of B comes in through B's ISB. This simplifies contact with the outside world, and provides a single spot for watching remote traffic (which has a lower level of trust than local traffic).
  5. The trust model between SNOs specifically assumes that each SNO's ISB has a valid SSL certificate, and each SNO will accept the other's certificate.
  6. The requesting SNO must provide an identifier of the person authorizing the request. (See the policy document, “Authentication of System Users.”) The receiving SNO does not need (and will in most cases be unable to) re-authenticate the original requestor.

Step 2: Aggregating the Identified Records

  1. The clinic can ask either for a set of pointers to the data, or ask the ISB to act as an aggregator and return the aggregated record directly. The ISB must support both types of request.
  2. Inter-ISB communications are always asynchronous. ISB A passes along the clinic's request for data to ISB B. B responds that it has received the request. When it is time to return data to A, it starts a transaction with A to deposit that data. Each ISB must therefore be able both to initiate outbound requests to other ISBs and to accept inbound transactions from them.
  3. It is up to SNO A to determine how the material is to be transferred from its ISB back to the initial requesting entity. The ISB can require the original requestor to check back periodically; can maintain an open connection via streaming until the data returns from ISB B; can even email or fax the data if those methods are supported.
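The asynchronous exchange in points 2 and 3 can be sketched as a pair of one-way calls: an immediate acknowledgment, then a later, separate transaction to deposit results. The class and method names here are illustrative only; the Common Framework specifies the pattern, not this interface.

```python
# Minimal sketch of asynchronous inter-ISB messaging: a request is
# acknowledged immediately, and results arrive later in a separate
# transaction. All class and method names are hypothetical.

class ISB:
    def __init__(self, name):
        self.name = name
        self.pending = {}    # request_id -> requesting ISB
        self.inbox = {}      # request_id -> delivered results

    def send_request(self, remote, request_id, demographics):
        """Forward a clinic's request to a remote ISB; returns only an ack."""
        return remote.receive_request(self, request_id, demographics)

    def receive_request(self, requester, request_id, demographics):
        self.pending[request_id] = requester
        return "received"    # an acknowledgment, never the data itself

    def deliver_results(self, request_id, records):
        """Later, initiate a new transaction to deposit results."""
        requester = self.pending.pop(request_id)
        requester.accept_results(request_id, records)

    def accept_results(self, request_id, records):
        self.inbox[request_id] = records
```

How SNO A then moves the deposited results from its ISB's inbox back to the requesting clinic (polling, streaming, even fax) is, as point 3 notes, a per-SNO choice.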

Step 3: Displaying or Otherwise Using the Records

As with records received from within the SNO, the current model places no constraints on how the records are made useful, other than to require that the consuming entity abide by policy requirements around privacy, security, auditing, etc.

3) Transfer of Data from an Entity in a SNO to an Unaffiliated Entity

A similar scenario can occur when an entity A, which is not part of any SNO, requests a patient's record held in SNO B.10 This scenario is critical for the network to grow organically, since the early days of any such network will necessarily cover only a minority of potential participants. The basic three-step transaction is the same as the transfer of data between SNOs, with these differences:

Step 1: Asking for Record Locations

The trust model between unaffiliated entities and SNOs assumes that any SNO accepting queries from unaffiliated entities will subject such requests to a higher standard of scrutiny and a higher level of audit, and will in no case honor them automatically.

Step 2: Aggregating the Identified Records

Communication from an ISB to any outside entity is always asynchronous. As a result, any clinic asking for material through an ISB must either operate an accessible online receiver for the results or have access to a third-party service that offers such receiving. The Common Framework is designed to allow for the creation of such third-party services, though in all cases the sending and receiving parties are responsible for the care of the patient's data, and will be liable for any loss occurring through third-party services they hire or subscribe to.

Step 3: Displaying or Otherwise Using the Records

As above, the current model places no constraints on how the records are made useful, other than to require that the consuming entity abide by policy requirements around privacy, security, auditing, etc.

4) From an Entity in a SNO to a Public Health Institution or Other Aggregator

The 2004-2005 work on the Common Framework concentrated on clinical data. However, in addition to handling identified clinical records about individual patients, there are many reasons to handle aggregate and partially anonymized records, including satisfying public health reporting requirements, quality reporting, and fraud detection. This scenario is quite different from any transfer of clinical data, and is handled differently in the Markle Connecting for Health model.

Because the Record Locator Service contains no clinical data, aggregate and anonymized requests are not dispatched to the RLS, but instead to the individual institutions, which respond to those requests directly. It is currently up to the individual SNO, in negotiation with the entities who are allowed access to aggregate or anonymized data, to determine whether such requests should go through the ISB or should be handled as direct connections between the entities and the aggregators of the data. This allows the partitioning strategy for protection of data to continue to operate even when handling aggregate data, and even when such requests are not governed by HIPAA, as with required public health reporting. Our model for direct aggregation from the source is the Shared Pathology Informatics Network (SPIN),11 and modeling of SPIN-style interactions in the Common Framework is part of the 2006 effort.

5) Subscription Models

Any of the above transactions may also be modeled as a subscription to a particular source of data, where an authorized user can request that, when a piece of remote content is updated, they receive either a notification of the update or the updated data itself. However, this pattern is not yet specified. The Common Framework is based on Web Services, which enormously lowers the required coordination among network participants, both in advance of and during a transaction. As a result of this loose coupling, subscription models of data transfer (e.g., "Notify me when this patient has new lab results.") can be modeled in two ways. The first is 'scheduled pull': scheduled requests from the querier to the RLS or data holder, which repeat automatically at intervals ranging from seconds to months, depending on the nature of the query. The other is 'triggered push', where the RLS or data holder watches for updates to data and pushes any such updates out to a list of subscribers or their designated proxies.

The design and implementation of such models is complex and highly dependent on the technical savvy of the member entities of the SNO. A number of variables affect decisions about subscriptions, such as who assumes the costs of maintaining the subscription information (the querier, in the case of scheduled pull, and the holder of the data, in the case of triggered push). As a result, like aggregation, the design and implementation of subscription models is currently envisioned as a per-SNO design choice, though with the assumption that observation of the various implementations in 2006 will provide a guide to any nationwide standardization.
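The two subscription patterns can be contrasted in a short sketch. The data-holder interface (a version counter and a read call) and the subscriber callback are assumptions for illustration; neither is specified by the Common Framework.

```python
# Sketch contrasting 'scheduled pull' and 'triggered push'. The holder
# interface (version counter, read/update methods) is hypothetical.

class ScheduledPull:
    """Querier-side subscription: re-issue the query on a schedule and
    take data only when the holder's version has advanced."""
    def __init__(self, holder):
        self.holder = holder
        self.last_seen = holder.version

    def poll(self):
        if self.holder.version > self.last_seen:
            self.last_seen = self.holder.version
            return self.holder.read()
        return None    # nothing new since the last scheduled check

class TriggeredPush:
    """Holder-side subscription: notify every subscriber on each update."""
    def __init__(self):
        self.version = 0
        self.data = None
        self.subscribers = []    # callbacks for subscribers or proxies

    def read(self):
        return self.data

    def update(self, data):
        self.version += 1
        self.data = data
        for notify in self.subscribers:
            notify(data)
```

The cost asymmetry noted above is visible here: the pull subscriber pays for the polling loop and its schedule, while the push holder pays to maintain the subscriber list.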

Broad Policy and Technical Requirements

The Common Framework provides a list of the minimal set of policies and standards that must be adopted by any participating SNO. On the policy and governance side, all incorporated members of a SNO12 must:

  1. Adopt the policies of the Common Framework (See the policy documents contained in The Markle Connecting for Health Common Framework: Resources for Implementing Private and Secure Health Information Exchange.)
  2. Agree to any SNO-wide policies in place

In addition, each SNO has three technical services it must offer:

  • A SNO-wide Record Locator Service, to allow authorized entities within the SNO to look for patient data
  • A matching algorithm, to match patient demographics contained in incoming requests with the records stored in the Record Locator Service.
  • An Inter-SNO Bridge (ISB), to allow authorized outside parties to look for and retrieve patient data

The basic rationale behind these governance and technical requirements is discussed below; the detailed policy recommendations are contained in The Markle Connecting for Health Common Framework: Resources for Implementing Private and Secure Health Information Exchange; the detailed technical implementation guides are contained in the “Health Information Exchange: Architecture Implementation Guide,” which is part of the same collection.

A health care entity can belong to more than one SNO; this would of course entail the additional expense of listing patient demographics and record location information in more than one place, and of reconciling contractual requirements where they differ between SNOs. There is no conceptual obstacle to multi-SNO membership, however. There is no minimum or maximum size for a SNO; a single institution can be a SNO so long as it adheres to the principles and standards of the Common Framework. In practice, only very large institutions will do this, as having a single institution as a SNO creates few of the efficiencies or cost savings that multi-entity SNOs can have.

Software Requirements for RLS, Matching Algorithm, and ISB

One of the key design principles of the Common Framework is that no particular software application is required; in the same way that email software from different vendors can all read the same standard email formats, the technical infrastructure of a SNO can be built on any suitably secure hardware and software platform,13 so long as it produces and consumes common data standards.14

The three applications a SNO is required to host15 are the Record Locator Service, a matching algorithm to match incoming queries for clinical data with patient records, and an Inter-SNO Bridge for traffic between the SNO and the outside world.

RLS

One of the basic software requirements of a SNO is the operation of a Record Locator Service (RLS). The institutions with the right to list patient demographics and record locations in the RLS are, by definition, the members of a SNO. Thus the RLS is the practical locus of most SNO-wide activity. The details of the RLS are covered in the “Health Information Exchange: Architecture Implementation Guide,” and the relevant policies in The Markle Connecting for Health Common Framework: Resources for Implementing Private and Secure Health Information Exchange, but the basic functions are described here.

The Common Framework makes the following assumptions about the design of the RLS:

  1. There is one RLS per SNO, which holds the universe of record locations that can be queried using the RLS service within that SNO.16 There is no meta-RLS, in keeping with the "No requirement for national services" design.
  2. The RLS is designed only for patient-centered queries. Aggregate queries (e.g., "Find all admissions in the last 24 hours presenting with shortness of breath") must be dispatched to the participating institutions, or run against aggregated databases that are collected and kept separately. The lack of clinical data at the RLS keeps the RLS from being a target of loss or theft of clinical data, and allows interactions to be optimized for a single, simple case.
  3. The RLS participates in two types of transactions—the addition, modification, or deletion of listed patient record locations from the entities that hold data on the patient, and requests for information about a particular patient from entities that want those locations.
  4. All transactions to and from the RLS are logged and audited.
  5. The RLS must have a valid SSL certificate, and may only communicate with requestors who support encrypted web communications (https).
  6. The RLS is designed to take a query from authorized users in the form of demographic details or, alternatively, a query in the form of a unique provider ID plus a Medical Record Number, which would enable them to use the RLS to find other records for the patient whose MRN they know.17
  7. The RLS must support patient data in incoming queries expressed in the HL7 2.4 format described in the “Health Information Exchange: Architecture Implementation Guide.”
  8. The RLS may support incoming queries expressed in the HL7 3.0 format described in the “Health Information Exchange: Architecture Implementation Guide.”
  9. The RLS must support both synchronous queries, where the data is returned in a single round trip, and asynchronous queries, where the data is delivered in a new session, some time after the original query. The querier may request either synchronous or asynchronous queries; the RLS may also default to asynchronous return of results if it is unable to complete a given query in a timely fashion.
  10. The RLS must implement a probabilistic matching algorithm for patient queries so that the chance of incidental disclosure (presenting a false match) is minimized. (See the policy document “Correctly Matching Patients with Their Records.”)
  11. In responding to such queries, the RLS will return zero or more matching demographic records, each including a locator (usually an Institution code and a Medical Records Number) to a set of clinical data for that patient. The locator contents are used for subsequent queries for clinical data.
  12. The RLS will return only records which meet or exceed a minimum probability level. (See the policy document “Correctly Matching Patients with Their Records.”)
  13. The RLS will not provide a “Break the Glass” procedure in which a physician or other inquirer can request an emergency exception to allow examination of records below the minimum probability level. Besides having a high probability of incidental disclosures and false positive matching, there is no logical additional method that the inquirer can use to positively identify the correct record. If a user has certainty that a record related to a specific patient exists at a particular entity, that user should work directly with that entity to attempt to locate the record.
  14. The RLS will return "as matched" data for any data transformations it performed in matching the data (e.g., noting that it matched a name provided as Elizabeth with a patient whose first name is listed as Liz.)18
  15. The RLS should not return demographic data in fields not submitted by the querier. The RLS may well have demographic details about a patient that a clinician has not submitted; these details should not be displayed, to avoid incidental disclosure, and the risk of authorized users fishing for data.
  16. The SNO must maintain a logical separation of clinical from demographic (identifying) data. The RLS itself will not hold clinical data or metadata; all of that is controlled by the entities that created the data, or who hold copies because they provide the patient with care.
  17. The design of the RLS assumes that the clinical data itself may be served from cached or other copied versions of the "live" clinical data, and it is acceptable to centralize the physical storage of this data, in order to control costs and guarantee service levels. However, wherever it is located, the data itself should remain in the control of the providing institution, which should be deferred to as the final source of truth on issues of data accuracy and cleanliness.
  18. At the time or shortly after records are published to the RLS from entities, the RLS must report obvious errors back to the publishing entity. Such errors include but are not limited to non-numeric characters or incorrect number of digits in numeric data such as SSN, day, month and year designations in Date of Birth; dates that are out of range (e.g., February 31); etc. The requestor is not required by the Common Framework to act on these reports, but the RLS must make them available, and the individual SNO may have a policy requiring particular responses to such errors.
  19. At the time of publishing records to the RLS from an entity, the RLS may report possible errors, including but not limited to name and gender fields with a high probability of inconsistency (e.g., Sylvia, M), pairs of records above the matching threshold with different dates of birth; patient records above the matching threshold with different local record numbers, etc. The publishing entity is not required by the Common Framework to act on these reports, but the RLS must make them available, and the individual SNO may have a policy requiring particular responses to such errors.
  20. The RLS must be able to provide an audit log indicating all entities that have published records on behalf of an individual patient and all users that have received record locations in response to requests regarding an individual patient.
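Publish-time checks like those in points 18 and 19 might look like the following sketch. The field names and specific rules are examples only; the Common Framework names categories of error, not this exact set.

```python
import re
from datetime import date

# Illustrative publish-time validation of a demographic record, in the
# spirit of points 18-19 above. Field names and rules are examples,
# not the Common Framework's specified set.

def obvious_errors(record):
    """Return a list of obvious errors the RLS must report back to the
    publishing entity (which is not obliged to act on them)."""
    errors = []
    ssn = record.get("ssn", "")
    if not re.fullmatch(r"\d{9}", ssn):
        errors.append("ssn: expected exactly 9 digits")
    try:
        y, m, d = (int(part) for part in record.get("dob", "").split("-"))
        date(y, m, d)          # rejects out-of-range dates, e.g. February 31
    except ValueError:
        errors.append("dob: not a valid calendar date")
    return errors
```

"Possible" errors of the kind in point 19 (e.g., a name/gender pair with a high probability of inconsistency) would need the matching machinery itself, so they are not sketched here.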

Adoption of a Matching Algorithm

The RLS stands between authorized queriers (either entities within a SNO, including possible aggregation services, or the ISB) and a database of patient demographics and record locations. The RLS's job is to take the incoming queries, format the contents of the message, and make a query to a matching algorithm that determines which records in the database are likely matches. Those records, and only those records, are returned by the matching algorithm to the RLS. The policies governing the matching algorithm are covered in “Correctly Matching Patients with Their Records.”

There is no standard matching algorithm that can be adopted nationwide, because matching is highly sensitive both to local regulations, as in regions that forbid the use of Social Security Numbers (SSNs) for matching, and to the relative “cleanliness” of the underlying data. The more accurate the collection and storage practices are, the more likely it is that highly accurate matching can be achieved with fewer fields. (See “Background Issues on Data Quality” in The Markle Connecting for Health Common Framework: Resources for Implementing Private and Secure Health Information Exchange for a discussion of this issue.)

Such algorithms are also highly sensitive to local characteristics of the data set being queried. A last name of Smith is a better predictor of uniqueness in Wewoka, OK than Waltham, MA; NYSIIS-transformed19 names are better matches for Anglo-Saxon names than French names; and so on. The adoption of a matching algorithm that satisfies the conditions below is a nationwide requirement; the nature and tuning of the particular algorithm must be left to the SNO itself.

The Common Framework makes the following assumptions about the matching algorithm:

  1. The algorithm itself is not specified; each SNO is free to use and tune any algorithm that meets the criteria below. Two of our prototype sites, Indianapolis and Mendocino County, have made their matching algorithms available as part of this release. Boston uses a commercially available product, as do many existing health care systems.
  2. Authorized queriers present a set of demographic details, and receive in return zero or more matching record locations. Only records meeting a minimum level of probability should be returned. That minimum level is calculated at each RLS such that the probability of returning a false positive match is very low (e.g., one chance in 100,000). Matches approaching but not reaching that level (sometimes called “fuzzy matches”) should not be returned to avoid incidental disclosures. Further, the querier should not be told which data elements do not match since that could encourage fishing. It is legitimate to suggest that the querier provide additional data fields if these were not provided in the initial query. The details of how those matches are calculated must be hidden from the querier by the RLS, to preserve the ability of SNOs to use different and selectively tuned matching algorithms while maintaining standard interfaces.
  3. Should the algorithm use transformations of the presented demographic data (e.g., treating Maggie and Margaret or off-by-one errors in numerical data as approximate matches) then the data returned should indicate both the fact of the match and the fact that a transformation was used in the match. The details of the display are up to the receiving application, but the information is provided to allow the requester, possibly in conversation with the patient, to add a check step against false positives, which are possible even with a high probability match.
  4. Because delivering too little information is far less dangerous than delivering the wrong information, the algorithm must be tuned to minimize false matches, even at the expense of increasing the number of failed matches (false negatives). The algorithm must meet the policy requirements for accuracy, currently described in “Correctly Matching Patients with Their Records.”
  5. A national health identification number is not required. Demographic matching can work at population scale, without triggering either the enormous expense or the political risk of failure that will attend any work on unique patient IDs. Should such an identifier exist, however, its use would still require the mechanics for matching and record location created by the RLS.20 The Social Security number, although far from perfect as an identifier, and other types of identifying numbers can increase the probability of achieving a correct match.
  6. Individual SNOs may allow or require local IDs to be used as identifiers for the RLS (e.g., a SNO in a region with a primary employer may add employer IDs to the criteria to be matched.)
  7. If there are records in the RLS that are below the matching threshold, the querier may not be presented a list to choose from, as this would create the very incidental disclosure the algorithm must be designed to avoid. (This restriction also forbids “wild card” searching, e.g., a search for all patients with the last name Smith.) Instead, the querier may be offered the ability to provide additional demographic details.
  8. The RLS cannot assure that all records that exist for a given patient will be located, even in principle, because the patient may have received care outside the SNO, and because even within the SNO there may be records that refer to the patient but fall beneath the matching threshold, or that are kept confidential because of patient preference, legal constraints from the State, or local policies set by the SNO or participating entities. The SNO may, at its discretion, require that displays of results returned from the RLS contain a reminder that the data may be only partial.
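A toy version of the behavior required in points 2 through 4, threshold filtering plus reporting of "as matched" transformations, might look like this. The nickname table, field weights, and threshold are stand-ins for a real, SNO-tuned probabilistic algorithm and are not drawn from the Common Framework.

```python
# Toy sketch of threshold matching with transformation reporting
# (points 2-4 above). The nickname table, weights, and threshold are
# stand-ins for a real, locally tuned probabilistic algorithm.

NICKNAMES = {"liz": "elizabeth", "maggie": "margaret"}

def normalize(name):
    name = name.lower()
    return NICKNAMES.get(name, name)

def match(query, stored, threshold=0.9):
    """Return (matched, transformed) for one candidate record.

    `transformed` flags that a transformation (e.g., Liz -> Elizabeth)
    was used, so the requester can add a check step against false
    positives, possibly in conversation with the patient."""
    score, weight, transformed = 0.0, 0.0, False
    for field, w in (("first_name", 0.5), ("dob", 0.5)):
        if field not in query or field not in stored:
            continue
        weight += w
        q, s = query[field], stored[field]
        if q == s:
            score += w
        elif field == "first_name" and normalize(q) == normalize(s):
            score += w
            transformed = True    # report the "as matched" transformation
    matched = weight > 0 and score / weight >= threshold
    return matched, matched and transformed
```

Note that a caller below the threshold learns only that there was no match, never which fields failed, in keeping with the anti-fishing requirement in point 2.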

ISB

The other application a SNO is required to host is the Inter-SNO Bridge (ISB). The ISB is the interface to data held by a SNO but used by institutions outside the SNO. It serves as a single point of access for all remote queries to entities inside any given SNO. The technical details of the ISB are in the “Health Information Exchange: Architecture Implementation Guide.” The relevant policies are contained in The Markle Connecting for Health Common Framework: Resources for Implementing Private and Secure Health Information Exchange.

The Common Framework makes the following assumptions about the design of the ISB:

  1. There is one ISB per SNO, which handles all per-patient clinical requests coming from or going to that SNO.
  2. The ISB is only for patient-centered queries. Aggregate queries (e.g., "Find all admissions in the last 24 hours presenting with shortness of breath") should be dispatched to the participating institutions, or run against aggregated databases that are collected and kept offline. The lack of centralized clinical data keeps the ISB from being a target of loss or theft of clinical data, and allows interactions to be optimized for a single, simple case.
  3. All transactions to and from the ISB are logged and audited.
  4. The ISB must have a valid SSL certificate, and may only communicate with requestors who support encrypted web communications (https).
  5. The ISB, like the RLS, is designed to take a query from authorized users in the form of demographic details or, alternatively, a query in the form of a unique provider ID plus a Medical Record Number.
  6. The ISB must support patient data in incoming queries expressed in the HL7 2.4 format described in the “Health Information Exchange: Architecture Implementation Guide.” The ISB may support incoming queries expressed in the HL7 3.0 format described in the “Health Information Exchange: Architecture Implementation Guide.”
  7. The ISB must support two possible patterns of request. The 'one pass' pattern has the requestor presenting patient details and receiving back the aggregated records. In this case, the ISB has acted as the aggregator, as described in Step 2 of the section Modeling Interactions, above.
  8. The other pattern that must be supported is a 'two pass' interaction, in which the ISB acts like a standard RLS, returning locators to the remote querier, who then replies with a list of records they would like access to.
  9. The ISB must support asynchronous delivery of records, where the requestor, whether a remote ISB or other entity, sends a request in, and then makes available a server for later delivery of the results of the request. "Later" may only be measured in seconds, but the asynchronous pattern is important because there is no guarantee that the ISB can dispatch and resolve all the required transactions local to the SNO quickly enough to support a synchronous return.
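The 'one pass' and 'two pass' patterns in points 7 and 8 can be contrasted in a short sketch. The method names, and the stand-ins for the RLS lookup and record fetch, are hypothetical; only the two-pattern requirement comes from the Common Framework.

```python
# Sketch of the ISB's 'one pass' and 'two pass' request patterns
# (points 7 and 8 above). The RLS lookup and record fetch are
# hypothetical stand-ins, not specified interfaces.

class BridgeSketch:
    def __init__(self, rls_lookup, fetch_record):
        self.rls_lookup = rls_lookup      # demographics -> locators
        self.fetch_record = fetch_record  # locator -> clinical record

    def one_pass(self, demographics):
        """ISB acts as aggregator: demographics in, assembled records out."""
        return [self.fetch_record(loc) for loc in self.rls_lookup(demographics)]

    def two_pass_locate(self, demographics):
        """First pass: behave like a standard RLS, return locators only."""
        return self.rls_lookup(demographics)

    def two_pass_retrieve(self, chosen_locators):
        """Second pass: return only the records the requestor selected."""
        return [self.fetch_record(loc) for loc in chosen_locators]
```

In practice either pattern would complete asynchronously, per point 9: the ISB acknowledges the request and deposits results with the requestor's receiver in a later transaction.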