HD 1991 - Hodgkin's disease - data structure and protocols


HD 1991 data preparation: brief written protocol

Strict confidentiality of trial results is observed. Information is held on the Clinical Trial Service Unit computers in a form that can be accessed only by known individuals. All patient records are converted into 'pink form' (1991) format (described below) if not already supplied in it. Results received as tables are converted into sets of synthetic 'pink form' records.

The following routine checks (where appropriate) are performed on every 'pink form' compilation.

The total numbers of patients and the distributions of randomisation age, disease stage, systemic symptoms and primary treatment outcome are checked for any significant imbalance between treatment groups. These four distributions are compared as follows. Patients are grouped into three categories according to randomisation age (below 30 years; 30 - 49 years or unknown; 50 years or above) and a chi-squared test is applied to the numbers of patients falling into the three categories in each treatment group. Similarly, five categories are formed for disease stage (I; II; III; IV; unknown), three for systemic symptoms (absent; present; unknown) and three for primary treatment outcome (complete remission; other failures; unknown), and these are tested in the same way as the categories formed for randomisation age.
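The age-category comparison can be illustrated with a minimal sketch. It assumes patients are available as (treatment group, age) pairs and computes only the Pearson chi-squared statistic and its degrees of freedom; the function names and the input layout are illustrative and not part of the HD 1991 software.

```python
from collections import Counter


def age_category(age):
    """Map a randomisation age in years to one of the three protocol categories.

    Unknown ages (None) are grouped with 30 - 49 years, as the protocol states.
    """
    if age is None:
        return "30-49 or unknown"
    if age < 30:
        return "below 30"
    if age < 50:
        return "30-49 or unknown"
    return "50 or above"


def chi_squared(table):
    """Pearson chi-squared statistic and degrees of freedom.

    `table` is {treatment_group: {category: count}} (an assumed layout).
    """
    groups = list(table)
    categories = sorted({c for g in groups for c in table[g]})
    row_totals = {g: sum(table[g].get(c, 0) for c in categories) for g in groups}
    col_totals = {c: sum(table[g].get(c, 0) for g in groups) for c in categories}
    # Empty rows or columns carry no information and would give zero expected counts.
    groups = [g for g in groups if row_totals[g] > 0]
    categories = [c for c in categories if col_totals[c] > 0]
    n = sum(row_totals.values())
    statistic = 0.0
    for g in groups:
        for c in categories:
            expected = row_totals[g] * col_totals[c] / n
            statistic += (table[g].get(c, 0) - expected) ** 2 / expected
    return statistic, (len(groups) - 1) * (len(categories) - 1)


def age_balance(patients):
    """Chi-squared check of age categories; `patients` yields (group, age or None)."""
    table = {}
    for group, age in patients:
        table.setdefault(group, Counter())[age_category(age)] += 1
    return chi_squared(table)
```

The same `chi_squared` function would serve for disease stage, systemic symptoms and primary treatment outcome once each patient has been assigned to the corresponding category; the statistic is then referred to a chi-squared distribution with the returned degrees of freedom.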

If an event such as the recurrence of disease is reported at a date later than the quoted last follow-up date, the last follow-up date is automatically changed to the later date. The completeness of follow-up is then calculated for the end of each calendar year.

The distributions of randomisation dates, randomisation ages and time elapsed since last follow-up are checked for any significant imbalance between treatment groups in two ways. Firstly, a t-test is applied to the difference between the mean value of each distribution for patients in each group and the corresponding mean for patients in the remainder. Secondly, an F-ratio is calculated for each distribution by comparing the variance between the groups with the variance within the groups.

The distribution of time elapsed since last follow-up is also checked in these two ways for any significant imbalance between patients with and patients without a recorded recurrence of disease. Finally, the distribution of time elapsed since randomisation is checked in the same two ways for any significant imbalance between patients in four categories of disease stage (I; II; III; IV), two categories of systemic symptoms (absent; present) and two categories of primary treatment outcome (complete remission; other failures).
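The two comparisons of means and variances can be sketched as follows. The protocol does not say which form of t-test is used, so this sketch assumes a pooled-variance two-sample t statistic; it also assumes that dates have already been converted to numbers (for example, days since a fixed origin). The function names are illustrative only.

```python
from math import sqrt
from statistics import mean, variance


def t_group_vs_remainder(values_by_group, group):
    """Pooled-variance t statistic (an assumed choice) for one group against
    all other groups combined. Returns (t, degrees of freedom)."""
    inside = values_by_group[group]
    outside = [v for g, vals in values_by_group.items() if g != group for v in vals]
    n1, n2 = len(inside), len(outside)
    pooled = ((n1 - 1) * variance(inside) + (n2 - 1) * variance(outside)) / (n1 + n2 - 2)
    t = (mean(inside) - mean(outside)) / sqrt(pooled * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2


def f_ratio(values_by_group):
    """One-way analysis-of-variance F-ratio: variance between groups divided by
    variance within groups. Returns (F, (df_between, df_within))."""
    groups = list(values_by_group.values())
    all_values = [v for vals in groups for v in vals]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    ss_between = sum(len(vals) * (mean(vals) - grand_mean) ** 2 for vals in groups)
    ss_within = sum((v - mean(vals)) ** 2 for vals in groups for v in vals)
    return (ss_between / (k - 1)) / (ss_within / (n - k)), (k - 1, n - k)
```

With only two groups the two checks agree, since the F-ratio is then the square of the t statistic; with three or more groups the t-test singles out an individual group while the F-ratio assesses the groups jointly.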

Where patient serial numbers form an obvious sequence, the sequence is checked for missing numbers.
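A minimal sketch of such a check, assuming the serial numbers are plain integers (the patient identifier field itself is alphanumeric, so this applies only where a numeric sequence is evident); the function name is illustrative:

```python
def missing_serials(serials):
    """Return, in ascending order, the integers absent from the range
    spanned by the smallest and largest serial numbers supplied."""
    present = set(serials)
    if not present:
        return []
    return [n for n in range(min(present), max(present) + 1) if n not in present]
```

For example, `missing_serials([1, 2, 4, 7])` returns `[3, 5, 6]`.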

A tabulated breakdown of variables is produced for each trial, together (where relevant) with lists of patients in 'problematical' categories such as those with lapsed follow-up, uncertain death cause or second malignancy site. Graphs of accrual date and of the proportion of living patients still on follow-up as a function of time from randomisation, by treatment allocation, are also produced, together with Kaplan-Meier life-table curves. Before trial data are finally incorporated into the overview, the analyses described above are sent to the participating trialist(s) for checking and approval.
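The Kaplan-Meier (product-limit) estimate underlying such curves can be sketched as below. This is a generic illustration, not the code used for the overview; it assumes each patient contributes a follow-up time and a flag saying whether the event (for example, death) was observed or the patient was censored.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times:  follow-up time for each patient (any consistent unit)
    events: True if the event was observed at that time, False if censored
    Returns a list of (time, survival) pairs, one per distinct event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        observed = 0  # events occurring exactly at time t
        leaving = 0   # patients leaving the risk set at time t (events and censored)
        while i < len(data) and data[i][0] == t:
            observed += int(bool(data[i][1]))
            leaving += 1
            i += 1
        if observed:
            survival *= 1 - observed / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve
```

Calling the function once per treatment allocation gives one curve per group. Patients censored at a given time are counted as at risk at that time and removed afterwards, which is the usual convention.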


Please address inquiries concerning data preparation and checking to:

Specification of HD 1991 'pink form' format

Item  Description                                    FORTRAN  Columns   Details  Abbreviation
      Trial/stratum identifying code                 I6       1 - 6              Trial
      Patient identifier (or sequence number)        A12      8 - 19             Patient
      Gender                                         I1       21
          Value  Description         Abbreviation
      Date of birth                                  I6       23 - 28   DDMMYY   Birth Date
      Laparotomy                                     I1       30
          Value  Description         Abbreviation
                 No                  No
                 Yes                 Yes
      Disease stage                                  I1       32
          Value  Description         Abbreviation
      Systemic symptoms                              I1       34
          Value  Description         Abbreviation
                 Absent              No
                 Present             Yes
      Date of initial treatment/diagnosis            I6       36 - 41   DDMMYY   1-Trt. Date
      Randomisation date                             I6       43 - 48   DDMMYY   Entry Date
      Treatment group allocated (as on master list)  I1       50                 Trt. Grp
10    Primary treatment outcome                      I1       52                 Prim. Outc
          Value  Description         Abbreviation
                 Complete remission  CR
                 Other failures      Ofail
11    Outcome evaluation date                        I6       54 - 59   DDMMYY   Outcome Date
12    Recurrence                                     I1       61
          Value  Description         Abbreviation
                 No                  No
                 Yes                 Yes
13    Date of first recurrence                       I6       63 - 68   DDMMYY   Rec. Date
14    State when last traced                         I1       70
          Value  Description         Abbreviation
                 Alive               Alive
                 Dead                Dead
                 Lost                Lost
15    Date died or last traced                       I6       72 - 77   DDMMYY   L.F.U.
16    Cause of death (extra category)                I2       82 - 83
      (there is also a 'pre-diktat' Cause of Death item, with the same coding,
      in form columns 79-80)
          Value  Description                              Abbreviation
                 Iatrogenic                               Iatro
                 Infective                                Infec
                 Leukaemia                                Leuk
                 Solid neoplasm                           Neop2
                 Cardiovascular                           CardV
                 Extraneous cause                         Extra
                 Not 1-5,8,13,14 and not Hodgkin's dis.   NotHD
          10     Not Hodgkin's disease                    NotHD
          11     Hodgkin's disease                        HD
          12     Unascertainable cause                    Unkn.
          13     Pulmonary                                Pulmo
          14     Non-Hodgkin's lymphoma                   NonHL
17    Name (if given), cause of death and comments            85 - end           Name
Missing or unknown items are left blank or set to zero.
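As an illustration of the fixed-column layout, the sketch below reads one 'pink form' record into a dictionary. The column positions are those given in the specification; the field names, the dictionary layout and the function names are assumptions made for this example. Dates are returned as (day, month, two-digit year) tuples because the DDMMYY format does not carry the century.

```python
# Field name: (first column, last column), 1-indexed and inclusive.
FIELDS = {
    "trial": (1, 6),
    "patient": (8, 19),
    "gender": (21, 21),
    "birth_date": (23, 28),
    "laparotomy": (30, 30),
    "stage": (32, 32),
    "symptoms": (34, 34),
    "first_treatment_date": (36, 41),
    "entry_date": (43, 48),
    "treatment_group": (50, 50),
    "primary_outcome": (52, 52),
    "outcome_date": (54, 59),
    "recurrence": (61, 61),
    "recurrence_date": (63, 68),
    "state": (70, 70),
    "last_follow_up": (72, 77),
    "death_cause": (82, 83),
}
DATE_FIELDS = {"birth_date", "first_treatment_date", "entry_date",
               "outcome_date", "recurrence_date", "last_follow_up"}


def _int_or_none(text):
    """Blank or zero means missing or unknown."""
    text = text.strip()
    if not text:
        return None
    return int(text) or None


def parse_record(line):
    """Split one fixed-column record into named fields (None where missing)."""
    record = {}
    for name, (first, last) in FIELDS.items():
        raw = line[first - 1:last]
        if name == "patient":
            record[name] = raw.strip() or None  # A12: kept as text
        elif name in DATE_FIELDS:
            n = _int_or_none(raw)
            # Integer arithmetic copes with a leading blank in place of a zero.
            record[name] = None if n is None else (n // 10000, n // 100 % 100, n % 100)
        else:
            record[name] = _int_or_none(raw)
    record["name"] = line[84:].strip() or None  # columns 85 to end
    return record
```

Parsing dates as integers rather than slicing the text means that a value written with an I6 format, where 010291 may appear as ` 10291`, is still split correctly into day, month and year.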

HD 1991 data form rubric



Special coding conventions


Patient identifier


Primary treatment outcome


[End of document, updated to 1 September 2000]