CRC 1991 - colorectal cancer - data structure and protocols

Contents List

CRC 1991 data preparation: brief written protocol

Strict confidentiality of trial results is observed. Information is held in the Clinical Trial Service Unit computers in a form which can be accessed only by known individuals.

All patient records are converted into 'green form' format (described below) if not already supplied in it. Results received as tables are converted into sets of synthetic 'green form' records. The following routine checks (where appropriate) are performed on every 'green form' compilation:

The total numbers of patients and the distributions of randomisation age, tumour site, tumour stage and gender are checked for any significant imbalance between treatment groups. These four distributions are compared as follows. Patients are grouped into four categories according to randomisation age (below 50 years; 50 - 64 years or unknown; 65 - 74 years; 75 years or above) and a chi-squared test is applied to the numbers of patients in the four categories in each treatment group. Similarly, three categories are formed for tumour site (colon; colon plus rectum, or unknown; rectum), five for tumour stage ('other' or unknown; A; B; C; D) and three for gender (male; unknown; female), and these are tested in the same way as the randomisation-age categories.
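The age comparison described above can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the statistic is the ordinary Pearson chi-squared for a table of counts, and the protocol does not say how the result is turned into a significance level, so the sketch stops at the statistic and its degrees of freedom.

```python
def age_category(age):
    """Index of the age category; an unknown age (None) joins 50 - 64."""
    if age is None:
        return 1
    if age < 50:
        return 0
    if age <= 64:
        return 1
    if age <= 74:
        return 2
    return 3


def contingency_table(patients, n_categories=4):
    """Count patients per treatment group (rows) and age category (columns).

    `patients` is an iterable of (treatment_group, age) pairs.
    """
    counts = {}
    for group, age in patients:
        row = counts.setdefault(group, [0] * n_categories)
        row[age_category(age)] += 1
    return [counts[group] for group in sorted(counts)]


def chi_squared(table):
    """Pearson chi-squared statistic and degrees of freedom for a count table."""
    # Drop empty rows and columns so that no expected count is zero.
    rows = [row for row in table if sum(row) > 0]
    if not rows:
        return 0.0, 0
    keep = [j for j in range(len(rows[0])) if sum(row[j] for row in rows) > 0]
    rows = [[row[j] for j in keep] for row in rows]

    row_totals = [sum(row) for row in rows]
    col_totals = [sum(col) for col in zip(*rows)]
    total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(rows):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    degrees_of_freedom = (len(rows) - 1) * (len(keep) - 1)
    return statistic, degrees_of_freedom
```

The same `chi_squared` function would serve for the tumour site, tumour stage and gender tables, given a suitable category function for each.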

If an event such as the recurrence of disease is reported at a date later than the quoted last follow-up date, the last follow-up date is automatically changed to the later date. The completeness of follow-up is then calculated for the end of each calendar year. The distributions of randomisation dates, randomisation ages and time elapsed since last follow-up are checked for any significant imbalance between treatment groups in two ways. Firstly, for each distribution, a t-test is applied to the difference between the mean value for patients in each group and the corresponding mean for patients in the remainder. Secondly, an F-ratio is calculated for each distribution by comparing the variance between the groups with the variance within the groups. The distribution of time elapsed since last follow-up is also checked in these two ways for any significant imbalance between those patients with and those patients without a recorded recurrence of disease. Finally, the distribution of time elapsed since last follow-up is checked in the same two ways for any significant imbalance between patients in two categories of tumour site (colon; rectum), two categories of tumour stage (A/B; C/D) and two categories of gender (male; female).
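The two checks described above can be sketched as below. The protocol does not state which form of t-test is used; this sketch assumes the pooled-variance (Student) two-sample statistic, and the F-ratio is the usual one-way analysis-of-variance ratio. The function names are hypothetical, and each group is assumed to be a list of numeric values (for example, days since last follow-up).

```python
from statistics import mean


def f_ratio(groups):
    """Between-group variance divided by within-group variance."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups) / (k - 1)
    within = sum((x - mean(g)) ** 2 for g in groups for x in g) / (n - k)
    return between / within


def t_group_vs_remainder(groups, index):
    """Pooled-variance t statistic for one group against all other patients."""
    group = groups[index]
    rest = [x for i, g in enumerate(groups) if i != index for x in g]
    n1, n2 = len(group), len(rest)
    m1, m2 = mean(group), mean(rest)
    sum_sq = sum((x - m1) ** 2 for x in group) + sum((x - m2) ** 2 for x in rest)
    pooled_variance = sum_sq / (n1 + n2 - 2)
    return (m1 - m2) / (pooled_variance * (1 / n1 + 1 / n2)) ** 0.5
```

With exactly two groups the square of the t statistic equals the F-ratio, which gives a convenient consistency check on any implementation.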

Where patient serial numbers form an obvious sequence, the sequence is checked for missing numbers.
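A minimal sketch of such a gap check, assuming the serial numbers are integers and that the sequence is expected to run without gaps from the lowest to the highest number seen (the function name is hypothetical):

```python
def missing_serials(serials):
    """Return the numbers absent between the lowest and highest serial seen."""
    present = set(serials)
    if not present:
        return []
    return [n for n in range(min(present), max(present) + 1) if n not in present]
```

For example, `missing_serials([1, 2, 4, 7])` returns `[3, 5, 6]`.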

A tabulated breakdown of variables is produced for each trial, together (where relevant) with lists of patients in 'problematical' categories such as those with lapsed follow-up, uncertain death cause or second malignancy site. Graphs of accrual date and the proportion of living patients still on follow-up as a function of time from randomisation by treatment allocation are also produced, together with Kaplan-Meier life-table curves. Before trial data are finally incorporated into the overview, the analyses described above are sent to the participating trialist(s) for checking and approval.
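The protocol does not describe how the Kaplan-Meier curves are computed. As an illustration only, the standard product-limit estimate can be sketched as follows, assuming each patient contributes a time from randomisation and a flag saying whether the event (for example, death) was observed at that time. The function name and input layout are hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    `times` are times from randomisation; `events` are truthy where the event
    was observed and falsy where the patient was censored at that time.
    Returns a list of (time, estimated survival) at each observed event time.
    """
    data = sorted(zip(times, events), key=lambda pair: pair[0])
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather everyone whose time equals t (events and censorings).
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                deaths += 1
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve
```

Calling the function once per treatment allocation would give one curve per group.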


Please address inquiries concerning data preparation and checking to:

Specification of CRC 1991 'green form' format

Item  Description  FORTRAN  Columns  Details  Abbreviation
Trial/stratum identifying code  I6  1 - 6  Trial
Patient identifier (or sequence number)  A12  8 - 19  Patient
Randomisation date  I6  21 - 26  DDMMYY  'Entry Date'
Treatment group allocated (as on master list)  I1  28  Trt. Grp
Surgery date  I6  30 - 35  Surg. Date
    Value  Description  Abbreviation
    -1  No surgery
    -2  No surgery, NOT on account of disease stage
    -3  Surgery but date unknown
    -4  Too ill for surgery
(Not used)
Tumour site  I1  39  Tum. Site
    Value  Description  Abbreviation
    Colon  Colon
    Rectum  Rectum
    Colon and rectum  Col+Rect
Tumour stage  A2  41 - 43  Tumour Stage
    Value  Description  Abbreviation
    B1  B1  B1
    B2  B2  B2
    B3  B3  B3
    C1  C1  C1
    C2  C2  C2
    C3  C3  C3
    Metastatic disease
    D?  Metastatic disease  D?
    Not colorectal cancer
    'Advanced/metastatic disease'
    Benign tumour
    Inoperable disease
    Y?  Inoperable disease  Y?
    Malignant tumour (unclassified)
Gender  I1  44
    Value  Description  Abbreviation
    Male  Male
    Female  Female
Randomisation age  I2  46 - 47  years  Age
10  Recurrence  I1  49
    Value  Description  Abbreviation
    No  No
    Yes  Yes
11  Date of first recurrence  I6  51 - 56  DDMMYY  Rec. Date
12  Type of first recurrence  I2  57 - 58  Type Rec
    Value  Description  Abbreviation
    Local only  Local
    Local and distant, liver unknown  L+D,?Hep.
    Distant only, including liver  Dist+Hep
    Distant only, excluding liver  Dist,NoHep
    Distant only, liver unknown  Dist,?Hep
    Distant, but local unknown  Dist,?Loc
    Local and distant, including liver  L+D+Hep.
    Local and distant, excluding liver  L+D,NoHep
    Local, but distant unknown  Loc+?Dis
    10  Unknown, liver sometime  ??+Hep.
    11  Unknown, but not liver  ??,NoHep
    12  Unknown  ??
13  State when last traced  I1  60
    Value  Description  Abbreviation
    Alive  Alive
    Dead  Dead
    Lost  Lost
14  Date died or last traced  I6  62 - 67  DDMMYY  L.F.U.
15  Cause of death (extra category)  I2  68 - 69
    Value  Description  Abbreviation
    Acute iatrogenic
    Leukaemia, lymphoma or myeloma
    Other second neoplasm
    Venous embolism
    Extraneous cause
    Not 1-8,13-18 or colorectal cancer
    10  Unspecified non-colo.-ca. cause
    11  Colorectal cancer or its mets.
    12  Unascertainable cause
    13  Renal failure
    14  Bowel fistula / ulcer
    15  Intestinal obstruction
    16  Probably not colorectal cancer
    17  Liver failure
    18  Gastrointestinal haemorrhage
    19  Second primary colorectal cancer
16  Name (if given) and comments  71 - end  Name
Missing or unknown items are left blank or set to zero.
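As an illustration of the column layout above, a 'green form' record could be read along the following lines. This is a sketch, not the unit's own software: the field names are hypothetical, blank and zero are both mapped to `None` in line with the rule on missing items, and dates are kept as integers (as a FORTRAN I6 read would give) with a helper to split DDMMYY. The coded values in the example record are arbitrary and chosen only to exercise the parser.

```python
# (hypothetical field name, first column, last column, kind);
# columns are 1-based and inclusive, as in the specification.
FIELDS = [
    ("trial", 1, 6, "int"),
    ("patient", 8, 19, "str"),
    ("entry_date", 21, 26, "int"),
    ("trt_grp", 28, 28, "int"),
    ("surg_date", 30, 35, "int"),   # negative values are the special codes
    ("tum_site", 39, 39, "int"),
    ("tum_stage", 41, 43, "str"),
    ("gender", 44, 44, "int"),
    ("age", 46, 47, "int"),
    ("recurrence", 49, 49, "int"),
    ("rec_date", 51, 56, "int"),
    ("rec_type", 57, 58, "int"),
    ("state", 60, 60, "int"),
    ("lfu_date", 62, 67, "int"),
    ("death_cause", 68, 69, "int"),
]


def parse_green_form(line):
    """Split one fixed-column record into a dict; missing items become None."""
    record = {}
    for name, first, last, kind in FIELDS:
        raw = line[first - 1:last].strip()
        if kind == "int":
            value = int(raw) if raw else None
            record[name] = None if value == 0 else value
        else:
            record[name] = raw or None
    record["name"] = line[70:].strip() or None  # column 71 to end of line
    return record


def split_ddmmyy(value):
    """Split an integer DDMMYY date into (day, month, two-digit year)."""
    return value // 10000, value // 100 % 100, value % 100


# An invented record laid out in the columns given above.
example_line = (
    "{:>6} {:<12} {:>6} {:1} {:>6}   {:1} {:<3}{:1} {:>2} {:1} "
    "{:>6}{:>2} {:1} {:>6}{:>2} {}"
).format("101", "P0001", "010391", "2", "-3", "1", "B2", "2", "67", "1",
         "150692", "10", "2", "201192", "11", "Smith")
example_record = parse_green_form(example_line)
```

Because the century is not stored in a DDMMYY date, `split_ddmmyy` returns the two-digit year unchanged and leaves the choice of century to the caller.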

CRC 1991 data form rubric



Special coding conventions


Patient identifier



CRC 1991 data preparation diktats

1 Local spread found at surgery

2 Incomplete excision / residual tumour after surgery

3 Local recurrence at surgery

4 Local recurrence reported on days 1-30 after surgery

5 Metastases at surgery

6 Metastases reported on days 1-30 after surgery

7 Recurrence at unknown site at surgery

8 Recurrence at unknown site reported on days 1-30 after surgery

9 Conversion into Dukes system

T        N    M    Dukes
Tis      0    0    X
1-2      0    0    A
3-4      0    0    B
Any      >0   0    C
Any      Any  >0   D
Adenoma            X ('benign')
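The conversion table can be expressed as a small function. This is a sketch under assumptions: T is passed as the string "is", the string "adenoma", or an integer from 1 to 4, while N and M are passed as integers; unknown N or M values are not handled, since the table does not say how they are treated. The function name is hypothetical.

```python
def tnm_to_dukes(t, n, m):
    """Map a TNM classification to a Dukes stage following the table above."""
    if t == "adenoma":
        return "X"          # 'benign'
    if m > 0:
        return "D"          # any T, any N, M > 0
    if n > 0:
        return "C"          # any T, N > 0, M = 0
    if t == "is":
        return "X"          # Tis, N0, M0
    if t in (1, 2):
        return "A"
    if t in (3, 4):
        return "B"
    raise ValueError("T value not covered by the conversion table: %r" % (t,))
```

The M and N tests come first because the table assigns D and C regardless of T.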

Note on simultaneity of events

Note on coding of relapse sites

[End of document, updated to 29 November 2000]