CLINICAL IMMUNOLOGY AND IMMUNOPATHOLOGY 52, 3-9 (1989)
Report of the Workshop on the Evaluation of T-Cell Subsets during HIV Infection and AIDS¹
PAMELA G. KIDD* AND ROBERT F. VOGT, JR.†
*Department of Laboratory Medicine, University of Washington, Seattle, Washington 98195, and †Division of Environmental Health Laboratory Sciences, Centers for Disease Control, Atlanta, Georgia 30329
INTRODUCTION

A symposium and workshop entitled “Evaluation of Immune Response during HIV Infection and AIDS: T Cell Subsets” took place in San Francisco on November 3, 1988, preceding the Third Annual Conference on Clinical Immunology sponsored by the Clinical Immunology Society (CIS). The symposium and workshop were chaired by John L. Fahey, M.D., and were organized by CIS, the National Institute of Allergy and Infectious Diseases (NIAID) AIDS program, and the Center for Interdisciplinary Research in Immunology and Disease (CIRID) at UCLA. The morning symposium drew together leaders in the field of T-cell subset research to share information, ideas, and experience in subset testing and quality assurance.

The morning symposium was followed by a 2-hr afternoon workshop in which participants were divided into two groups to allow for discussion of issues raised by the morning presentations. One group was cochaired by Janis Giorgi and Daniel Stites with panelists Helene Paxton, Janet Nicholson, and Rebecca Gelman. The other group was cochaired by Alan Landay and Dorothy Lewis with panelists Mary Ann Fletcher, William Rickman, and John Koepke. The chairpersons developed a common agenda addressing two general questions: (1) What markers should be included when immunophenotyping leukocytes from HIV-infected patients? and (2) What technical issues need to be addressed to assure that the results of the tests are valid? Both groups reconvened for the final hour of the workshop to share ideas, concerns, and issues raised during the individual discussions.

SELECTION OF SURFACE MARKERS FOR IMMUNOPHENOTYPING
The participants discussed several factors that may affect the selection of surface marker tests. These include quality assurance considerations, the patient and control populations under study, the anticipated effects of therapeutic agents (both anti-viral drugs and biological response modifiers, BRM), and the purposes for which the test results are to be used. The purposes for immunophenotyping may include diagnosis, prognosis, assessing the efficacy of treatment, and/or searching for clues to pathogenesis. Although T-cell subsets were used for diagnostic purposes before the advent of more specific and sensitive tests, all discussants agreed that T-cell subsets should not be used for diagnosing the presence of HIV infection. In contrast, discussants agreed that selected surface markers should be used in seeking information about prognosis or therapeutic responses to anti-viral agents.

¹ Presented as part of a symposium entitled “Evaluation of the Immune System during HIV Infection and AIDS: T Cell Subsets,” November 3, 1988, San Francisco, CA.
0090-1229/89 $1.50. Copyright © 1989 by Academic Press, Inc. All rights of reproduction in any form reserved.

An underlying assumption in the discussion of surface marker selection was that immunophenotyping panels provide information about the numbers of cells in different lineages and states of differentiation but cannot provide direct information about the function of those cells.

There was uniform consensus that the measurement of circulating CD4 lymphocytes should be included as part of any panel testing HIV-positive patients. There was less consensus on the value of including other markers, particularly in view of the costs involved. Concerns over quality assurance may justify routinely enumerating total T-cells, total B-cells, and natural killer (NK) cells to account for all lymphocytes (discussed below). Beyond this, the question of the usefulness and cost effectiveness of testing for other markers remains open.

Several groups have explored the use of antibodies to antigens that further characterize CD4 and CD8 subsets. Using two- or three-color analysis, efforts have been made to identify those subsets that may correlate with prognosis or therapeutic responses, as presented in the morning symposium. Markers that further divide CD4 lymphocytes into inducers of help and inducers of suppression have been investigated in several studies, with some evidence suggesting a greater decrease in the inducer of help subset (CD4+4B4+).
However, most discussants did not feel that these CD4 subset measurements add significantly to the information gained from the measurement of CD4 alone. Activation markers on CD8 lymphocytes (CD8+DR+, CD8+CD38+) appear more promising, although they have not been investigated extensively. Rates of change or ratios of certain subset markers may provide more sensitive indicators of disease status than individual subset measurements alone. A recurrent point in these discussions was that much of the data already accumulated remain to be analyzed in terms of sensitivity, specificity, and predictive value. The general consensus was that data from smaller research protocols should be analyzed as thoroughly as possible so that promising findings could be tested in larger studies.

TECHNICAL ISSUES AND QUALITY ASSURANCE
The participants discussed a number of issues in response to concerns over getting the “right” answer from flow cytometric analysis. A genuine assessment of accuracy is not possible without true reference standards, which do not exist for any of the formed elements of blood. The only available indicator of accuracy is therefore the degree of consensus on samples analyzed in different laboratories. Since analytical results can be strongly influenced by differences in methodology and technique, there is a need for methodological guidelines that address the following concerns: sample collection, preservation, and storage; sample preparation and staining; selection of monoclonal antibodies; instrument calibration/standardization; flow cytometric analysis; data analysis; hematologic analysis for total white blood cell (WBC) and differential counts; reproducibility and accuracy; operator training and certification.
Sample collection, preservation, and storage. Discussants agreed that, ideally, all samples would be drawn at the same time of day, preserved at room temperature in EDTA, and analyzed within 4 hr. Deviations from this approach should be documented to give comparable results within each laboratory. The time of day when a specimen is drawn may affect results due to diurnal variation, especially in the total WBC and the percentage of lymphocytes. Subset percentages are apparently less affected by the time of day and are more stable from day to day than the WBC and the percentage of lymphocytes. Some discussants suggested that the subset percentage might be a better value for stratifying patients for treatment protocols, since it is not subject to such large variations as the absolute number (which is calculated from the WBC and lymphocyte count). EDTA, heparin, and ACD were all considered appropriate anticoagulants for samples to be analyzed by whole blood lysis. Some participants using whole blood methods preferred heparin or ACD for specimens analyzed 24 hr or more after being drawn, while others felt that EDTA was as good or better for both short- and long-term storage. It was noted that EDTA has the advantage of allowing total WBC and differential counts on the same sample. Room temperature storage was generally accepted as standard, although reports of changes in CD4 counts caused by refrigerated storage may not apply to the newer whole blood techniques. Samples should be prepared within 24 hr of collection, preferably sooner.

Sample preparation and staining. Most discussants agreed that whole blood lysis techniques give more consistent results than density gradient separation of cells, which may result in selective loss of some subsets. Laboratories with extensive experience in separating cells may achieve the same levels of consistency; however, this method is technically more demanding.
If samples to be prepared by density gradient separation cannot be processed within 4 hr of collection, they should be diluted with an equal volume of cell culture medium containing dextrose. Whole blood techniques require less volume but do not allow for cryopreservation of leukocytes for future study. In whole blood procedures, erythrocytes may be lysed before or after incubation with labeled antibodies, and there was no evidence that different methods of whole blood lysis gave different results. It was agreed that fixation of cells with paraformaldehyde following staining and lysing was desirable for biosafety reasons.

Selection of monoclonal antibodies. The selection of antibodies to be used in examining specimens from HIV-positive patients is addressed in a philosophical manner in the first section. Various clones are available from commercial companies and from individual researchers. No judgment was made as to the choice of clones to be used. Beyond the biological considerations, it was suggested that a minimal panel of surface markers be used to document the integrity of the final answers for all subsets. This panel should include markers for total T-cells (CD2 or CD3), total B-cells (CD19 or CD20), NK cells (Leu 19 or NKH-1), total leukocytes (CD45), myelomonocytic leukocytes (CD13 or CD15), and isotype controls. Total T-cells + total B-cells + NK cells should approximate 100% of the lymphocytes. This is not always true, or is less predictable, in abnormal specimens where one may encounter unusual populations of cells. However, it remains a quick internal check on an individual specimen.

Instrument calibration/standardization. All participants agreed that samples should be analyzed under instrument conditions that are optimized, consistent, and well documented. Forward and right angle light scatter settings should allow the clear visualization of all leukocyte populations. Fluorescence settings should allow some visualization of the nonspecific fluorescence in the unstained cells and clear discrimination of the stained cells. Once these conditions are determined on prepared leukocytes, consistent calibrators should be used to reproduce them and to monitor instrument performance. The ideal scheme would include a “key bead” with a highly stable fluorochrome, a set of standardized calibration beads with measured amounts of the same fluorochrome as the stained cells, and a stabilized cellular material (such as fixed calf thymocyte nuclei) also stained with the same fluorochrome.
The key bead provides the best indicator of instrument consistency, the calibration beads provide a standard curve that can convert histogram channels into gravimetric equivalents of labeled antibody (an indicator of antigen density), and the cellular material provides the best indicator of consistency for light scatter and fluorescence intensity of stained leukocytes. All three materials can be included in a single “calibration cocktail” that is used to adjust laser power and PMT voltages to the predetermined optimal values. Records of the instrument settings and histogram parameters from the calibration materials would provide a complete account of daily cytometer performance. These areas of concern will be dealt with in detail in the National Committee for Clinical Laboratory Standards (NCCLS) provisional report to be issued in early 1989.

Flow cytometric analysis. Gating (drawing of a bitmap) around the population of cells to be analyzed is an art as well as a science. Understanding the specimen to be examined and anticipating the information desired will direct the choice of the population to be studied. Ideally, the raw data from sample analysis would be accumulated with completely “open” gates and stored in “list mode” files, from which they could be replayed if necessary under different gating constraints. The light scatter gate parameters should be adjusted to include the maximum possible number of lymphocytes with minimal contamination by other cells or debris. Any changes required in the light scatter gate parameters for a particular sample should be noted. Samples showing significant contamination that cannot be excluded by gating should be prepared and stained again; if the contamination is still evident, the sample should be discarded. If significant numbers of lymphocytes lie outside the normal light scatter gates, they should be independently characterized for surface marker phenotype. Discussants generally felt that ≤5% contamination of the lymphocyte gate or ≤5% lymphocytes outside the gate was acceptable and that levels beyond that required reanalysis, arithmetic correction, or repeat preparation and staining.

Data analysis. Data obtained from the flow cytometer should be checked for internal consistency on each sample. The gated lymphocyte population should include almost all lymphocytes and exclude almost all other events, as discussed above. All specimens should be run with isotype controls. In analyzing data, the cursor should be set on the isotype controls so that between 0.5 and 2% of the negative cells, as defined by the isotype control, fall in the positive region. If the specimen is not optimal or if the population of cells being studied is marking dimly, then some judgment is necessary in defining the positive cells. Optimally, results should be accumulated in a database format that allows easy access for retrieval and statistical analysis.

Total white blood cell count (WBC) and leukocyte differential. The ideal method would measure the WBC and differential as part of the flow cytometric analysis, but participants felt that this technology requires further development and assessment. The current alternative of choice is an automated cell count and differential performed in a clinical laboratory on a fresh specimen analyzed with proper quality control. Manual cell counts and differentials are subject to a high degree of imprecision and should not be used. Subset percentages determined entirely on the flow cytometer have proven much more reproducible than absolute subset counts, which must be calculated from the total WBC and differential.
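The dependence of the absolute count on the WBC and differential can be made concrete with a short calculation. The Python sketch below is illustrative only: the function name `absolute_subset_count` and the example values are assumptions, not figures reported at the workshop. It shows that an identical CD4 percentage yields different absolute counts when the WBC or the lymphocyte percentage shifts.

```python
def absolute_subset_count(wbc_per_ul, lymphocyte_pct, subset_pct):
    """Return subset cells per microliter from WBC, % lymphocytes, and % subset.

    Hypothetical helper for illustration; percentages are on a 0-100 scale.
    """
    for name, pct in (("lymphocyte_pct", lymphocyte_pct), ("subset_pct", subset_pct)):
        if not 0 <= pct <= 100:
            raise ValueError(f"{name} must be between 0 and 100")
    if wbc_per_ul < 0:
        raise ValueError("wbc_per_ul must be non-negative")
    # Multiply first and divide once so whole-number inputs stay exact.
    return wbc_per_ul * lymphocyte_pct * subset_pct / 10_000


# Example values are made up: the CD4 percentage is 40% in both cases,
# but the WBC and the lymphocyte percentage differ between the two draws.
first_draw = absolute_subset_count(6000, 30, 40)
second_draw = absolute_subset_count(7000, 35, 40)
print(first_draw, second_draw)
```

Because every term multiplies into the result, imprecision or diurnal variation in the WBC or differential carries directly into the absolute count, whereas the percentage is obtained from the cytometer alone.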
Several participants felt that the current use of absolute counts should be reexamined and perhaps replaced by percentages, not only because of the diurnal variation of the WBC and lymphocyte count, as discussed above, but also for reasons of reproducibility. Such reexamination could be done retrospectively, since percentages should be available for all results expressed as absolute counts.

Reproducibility and accuracy. A fair amount of time was spent on the question “How do you know you are getting the right answer?” The answer is not simple, since there is no “gold standard” against which flow cytometric data can be measured. One lesson that emerged from the multicenter trials was that the starting point is for individual laboratories to be able to obtain reproducible values on the same specimen tested multiple times. It was agreed that laboratories should be able to obtain values within an absolute 5% of each other on duplicate specimens (e.g., with a first CD4 value of 42%, the second CD4 value on the same specimen should be between 37 and 47%) and that reproducibility studies should be done once a month in a blinded manner. Participation in an interlaboratory quality assurance program, with appropriate feedback to the laboratory, offers another useful way of assessing how the values one laboratory obtains compare with values obtained by other laboratories on the same specimen. A program in which specimens are sent monthly, with prompt feedback to laboratories on their performance, is optimal. There was no consensus on the value of routine analysis of “normal” samples.
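The duplicate-specimen criterion described in this section amounts to a check on the absolute difference between two percentage values. A minimal Python sketch follows; the function name `duplicates_agree` and its tolerance argument are hypothetical conveniences, with the default of 5 percentage points taken from the discussion.

```python
def duplicates_agree(first_pct, second_pct, tolerance_points=5.0):
    """Return True if two subset percentages measured on the same specimen
    differ by no more than `tolerance_points` absolute percentage points.

    Hypothetical helper; the 5-point default follows the workshop discussion.
    """
    return abs(first_pct - second_pct) <= tolerance_points


# With a first CD4 value of 42%, repeats from 37% to 47% pass the check.
for repeat in (36.0, 37.0, 45.0, 47.0, 48.0):
    print(repeat, duplicates_agree(42.0, repeat))
```

A laboratory could apply such a check to its monthly blinded duplicates and log the outcome alongside the instrument records.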
The best argument for performing lymphocyte subset testing regularly on a set panel of people was that it enables the laboratory to identify possible trends in the values obtained for various CD groups. Overall, most participants felt that the common use of standardized methods and instrument calibration would add to the value of the information obtained through flow cytometric analysis of leukocytes from HIV-infected individuals. Uniform approaches would also be advantageous when immunophenotyping is used in studies of other infectious diseases, immune disorders, and environmental exposures to immunotoxicant agents.

Flow cytometer operator training and certification. It was agreed that the operation of a flow cytometer requires special training and experience. Ways of obtaining training include: (1) courses taught by the flow cytometer manufacturers, (2) continuing education courses offered by universities, industry, and professional societies, and (3) apprenticeships in experienced laboratories. Participants who reported their experience with training programs emphasized the need for exposure to both the theoretical and the practical aspects of flow cytometry. Laboratory training was best when the groups were small, the trainees were at least somewhat familiar with their instruments, and at least 3 to 5 days could be spent in the lab. Cross-fertilization through communication among the trainees was an important benefit. It was generally agreed that it takes approximately 1 year of experience to become proficient in the operation of a flow cytometer and the analysis of the data. At this time, no certification program exists for flow cytometrists. The Society for Analytical Cytology (SAC) has a committee on certification which is working with the American Society of Clinical Pathologists (ASCP) to develop the criteria for certification in this area.
As such a program is developed, emphasis should be placed on education, laboratory experience and skills, and a high degree of cognitive judgment, much of which comes only through experience.

SUMMARY OF WORKSHOP DISCUSSION
1. There is no clear evidence at this time that lymphocyte subset measurements, other than CD4, provide prognostic value in HIV infection. However, there are several populations of activated CD8 cells which need further investigation.
2. Internal quality assurance argues for routine measurement of CD8, total T-cells (CD2 or CD3), and total B-cells (CD19 or CD20) in addition to CD4.
3. Data on lymphocyte subsets from many clinical studies remain to be thoroughly analyzed. Researchers are urged to examine information from these studies as expeditiously as possible so that markers that may be clinically useful can be explored in larger studies.
4. Because of the diurnal variation and the inherent coefficient of variation in measuring WBC and lymphocytes, discussants felt that the percentage of lymphocyte subsets (specifically CD4) may be a better number to use than the calculated absolute number in following patients or stratifying patients for treatment protocols.
5. As the use of flow cytometry has moved from research areas to the clinical laboratory, the need for methodologic guidelines has become apparent. Technical and quality assurance issues were discussed at length with some general agreement. Most of these issues will be addressed in detail in the NCCLS provisional report on flow cytometry to be issued in 1989.
6. With an increasing demand for qualified flow cytometrists, there is recognition of the need for appropriate training. There was strong sentiment to encourage SAC and ASCP to pursue the possibility of special certification in this area.

Received January 24, 1989; accepted February 9, 1989