BC 1995 - breast cancer - data structure and protocols


BC 1995 data preparation: brief written protocol

Strict confidentiality of trial results is observed. Information is held in the Clinical Trial Service Unit computers in a form which can be accessed only by known individuals.

All patient records are converted into 'pink form' (1995) format (described below) if not already supplied in it. Results received as tables are converted into sets of synthetic records. The following routine checks (where appropriate) are performed on every compilation:

The total numbers of patients and the distributions of randomisation age, menopausal status, axillary nodal status, oestrogen receptor status and progesterone receptor status are checked for any significant imbalance between treatment groups. The five distributions are compared as follows. Patients are grouped into three categories according to randomisation age (below 50 years; 50 - 69 years or unknown; 70 years or above) and a chi-squared test is applied to the numbers of patients in the three categories within each treatment group. Similarly, three categories are formed for menopausal status (pre- or perimenopausal; unknown; postmenopausal), for axillary nodal status (negative; unknown; positive), for oestrogen receptor status (poor; unknown; positive) and for progesterone receptor status (poor; unknown; positive), and these are tested in the same way as the randomisation age categories.
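The age-category check can be sketched in Python roughly as follows. This is an illustration only, not the Clinical Trial Service Unit's own program: the function names are hypothetical, an age of zero or None is assumed to mean "unknown", and the p-value uses a closed form that holds only when the number of degrees of freedom is even (as it is for three categories).

```python
import math

def age_category(age):
    """Map a randomisation age to one of the three categories in the text.
    Assumption: None or 0 means the age is unknown."""
    if age is None or age == 0:
        return 1          # 50 - 69 years or unknown
    if age < 50:
        return 0          # below 50 years
    if age < 70:
        return 1          # 50 - 69 years
    return 2              # 70 years or above

def chi_squared(table):
    """Pearson chi-squared statistic and degrees of freedom.
    Rows are treatment groups, columns are categories; every row and
    column is assumed to contain at least one patient."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_totals) - 1)
    return stat, df

def chi_squared_p_value(stat, df):
    """Upper-tail probability of the chi-squared distribution.
    Uses the finite-sum closed form, which is valid only for even df."""
    if df <= 0 or df % 2:
        raise ValueError("closed form requires a positive, even df")
    half = stat / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total
```

For two treatment groups and three categories, `chi_squared([[20, 30, 10], [10, 30, 20]])` gives a statistic on 2 degrees of freedom, which `chi_squared_p_value` converts to a significance level. The same functions apply unchanged to the menopausal, nodal and receptor status categories.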

If an event such as the recurrence of disease is reported at a date later than the quoted last follow-up date, the last follow-up date is automatically changed to the later date. The completeness of follow-up is then calculated for the end of each calendar year. The distributions of randomisation dates, randomisation ages and time elapsed since last follow-up are checked for any significant imbalance between treatment groups in two ways. Firstly, a t-test is applied to the difference between the mean value of each distribution for patients in each group and the corresponding mean for patients in the remainder. Secondly, an F-ratio is calculated for each distribution by comparing the variance between the groups with the variance within the groups. The distribution of time elapsed since last follow-up is also checked in these two ways for any significant imbalance between patients with and patients without a recorded recurrence of disease. Finally, the distribution of time elapsed since last follow-up is checked in the same two ways for any significant imbalance between patients in two categories of menopausal status (pre- or perimenopausal; postmenopausal), two categories of axillary nodal status (negative; positive), two categories of oestrogen receptor status (poor; positive) and two categories of progesterone receptor status (poor; positive).
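A minimal Python sketch of these follow-up checks is given below, under several assumptions that the protocol does not state. Dates are assumed to have been converted to plain numbers (for example a day count), the t-test is assumed to be the pooled-variance form, and all function names are hypothetical.

```python
import math

def effective_last_follow_up(last_follow_up, event_dates):
    """Move the last follow-up date forward to the latest reported event."""
    return max([last_follow_up, *event_dates])

def mean(values):
    return sum(values) / len(values)

def sample_variance(values):
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)

def t_group_vs_remainder(groups, index):
    """t statistic comparing one group with all other patients pooled.
    Assumption: pooled-variance (equal variance) two-sample t-test."""
    inside = groups[index]
    outside = [v for i, g in enumerate(groups) if i != index for v in g]
    n1, n2 = len(inside), len(outside)
    pooled = ((n1 - 1) * sample_variance(inside)
              + (n2 - 1) * sample_variance(outside)) / (n1 + n2 - 2)
    return (mean(inside) - mean(outside)) / math.sqrt(pooled * (1 / n1 + 1 / n2))

def f_ratio(groups):
    """One-way analysis-of-variance F ratio: the between-group mean square
    divided by the within-group mean square."""
    all_values = [v for g in groups for v in g]
    grand = mean(all_values)
    k, n = len(groups), len(all_values)
    between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups) / (k - 1)
    within = sum((v - mean(g)) ** 2 for g in groups for v in g) / (n - k)
    return between / within
```

With exactly two groups the two checks agree, since the square of the pooled t statistic equals the F ratio; with three or more treatment groups they give different views of the same imbalance.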

Where patient serial numbers form an obvious sequence, the sequence is checked for missing numbers.
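One simple way to perform such a check is sketched below in Python. It assumes the serial numbers have already been extracted from the patient identifiers as integers; the function name is hypothetical.

```python
def missing_serial_numbers(serials):
    """Return the integers absent between the lowest and highest serial number."""
    present = set(serials)
    if not present:
        return []
    return [n for n in range(min(present), max(present) + 1) if n not in present]
```

For example, `missing_serial_numbers([1, 2, 4, 7])` returns `[3, 5, 6]`.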

A tabulated breakdown of variables is produced for each trial, together (where relevant) with lists of patients in 'problematical' categories such as those with lapsed follow-up, uncertain death cause or second malignancy site. Graphs of accrual date and the proportion of living patients still on follow-up as a function of time from randomisation by treatment allocation are also produced, together with Kaplan-Meier life-table curves. Before trial data are finally incorporated into the overview, the analyses described above are sent to the participating trialist(s) for checking and approval.
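For readers unfamiliar with Kaplan-Meier curves, the product-limit estimate can be sketched in a few lines of Python. This is a generic illustration, not the program used for the overview. The function name is hypothetical, times are assumed to be measured from randomisation, and patients censored at the same time as an event are assumed to be still at risk at that time.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.
    times:  follow-up duration for each patient
    events: True if the event (e.g. death) occurred, False if censored"""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather every patient whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                deaths += 1
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve
```

One curve would be computed per treatment allocation and the curves plotted together for comparison.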


Please address inquiries concerning data preparation and checking to:

Specification of BC 1995 'pink form' format

Item  Description  FORTRAN  Columns  Details  Abbreviation
Trial/stratum identifying code  I6  1 - 6  Trial 
Patient identifier (or sequence number)  A12  8 - 19  Patient 
Randomisation date  I6  21 - 26  DDMMYY  Rand. Date 
Treatment group allocated (as on master list)  I1  28  Trt. Grp 
Randomisation age  I2  30 - 31  years  Age 
5  Menopausal status on entry  I1  33
Value  Description  Abbreviation
Pre-menopausal  Pre 
Peri-menopausal  Peri 
Post-menopausal  Post 
Artificial  Arti 
Surgery: first mastectomy  I2  35 - 36 
Value  Description  Abbreviation
1  Radical  Rdcl
2  Total  Totl
3  Simple without clearance  SimN
4  Partial with clearance  ParY
5  Partial without clearance  ParN
6  Lumpectomy with clearance  LumY
7  Lumpectomy without clearance  LumN
8  Partial, clearance unknown  Par?
9  Lumpectomy, clearance unknown  Lum?
10  Subcutaneous  Subc 
11  Simple with clearance  SimY 
12  Other  Othr 
13  None  None 
Axillary status on entry  I2  37 - 38 
Value  Description  Abbreviation
1  N0 (clearance)  pN0
2  N1-3 (clearance)  pN1-3
3  N4+ (clearance)  pN4+
4  N- (sample only)  sN-
5  N+ (sample only)  sN+
6  N- (clinical)  cN-
7  N+ (clinical)  cN+
8  N- (method unknown)  ?N-
9  N+ (method unknown)  ?N+
10  N+ (clearance)  pN+ 
11  Benign lesion  Benign 
12  N- (clinical) N0 (clearance)  cN-pN0 
13  N- (clinical) N+ (clearance)  cN-pN+ 
14  N+ (clinical) N0 (clearance)  cN+pN0 
15  N+ (clinical) N+ (clearance)  cN+pN+ 
16  Not breast cancer  Not BC 
Oestrogen receptor measurement (fmol/mg protein)  I4  40 - 43  Est. Rec., E.R.
Value  Description  Abbreviation
fmol/mg protein
-1  Negative  ERpoor
-2  Marginal  ERpoor
-3  Positive  ER+
-4  < 10 fmol/mg protein  ERpoor
-5  10 - 19 fmol/mg protein  ER+
-6  20 - 29 fmol/mg protein  ER+
-7  30 - 49 fmol/mg protein  ER+
-8  50 - 99 fmol/mg protein  ER+
-9  100+ fmol/mg protein  ER++
-10  10 - 29 fmol/mg protein  ER+
-11  30 - 100 fmol/mg protein  ER+
-12  10 - 99 fmol/mg protein  ER+
-13  0 fmol/mg protein  ER0
-14  10 - 49 fmol/mg protein  ER+
Progesterone receptor measurement  I4  45 - 48  Prg.Rec., P.R.
Value  Description  Abbreviation
fmol/mg protein
-1  Negative  PRpoor
-2  Marginal  PRpoor
-3  Positive  PR+
-4  < 10 fmol/mg protein  PRpoor
-5  10 - 19 fmol/mg protein  PR+
-6  20 - 29 fmol/mg protein  PR+
-7  30 - 49 fmol/mg protein  PR+
-8  50 - 99 fmol/mg protein  PR+
-9  100+ fmol/mg protein  PR++
-10  10 - 29 fmol/mg protein  PR+
-11  30 - 100 fmol/mg protein  PR+
-12  10 - 99 fmol/mg protein  PR+
-13  0 fmol/mg protein  PR0
-14  10 - 49 fmol/mg protein  PR+
10  Site of second malignancy  I4  50 - 53  2nd Malig.
Value  Description  Abbreviation
140x  Lip  Lip
141x  Tongue  Tongue
142x  Salivary glands  Salivary
143x  Gingival  Gingival
144x  Mouth floor  MouFloor, OraF
145x  Oral (general)  Oral
146x  Oropharyngeal  Opharynx
147x  Nasopharyngeal  Npharynx
148x  Hypopharyngeal  Hpharynx, Hpha
149x  Oral / throat (general)  OralThro, Oral
150x  Oesophageal  Oesophag, Oeso
151x  Gastric  Gastric, Gast
152x  Small intestine / duodenal  SmallInt, SBow
153x  Colon / colorectal (general)  Colon, CoRe
154x  Rectal  Rectal, Rect
155x  Hepatic  Hepatic, Hepa
156x  Gall bladder / ext. biliary  Gbladder, Gbla
157x  Pancreatic  Pancreas, Panc
158x  Peritoneal / retroperitoneal  Periton.
159x  Digestive / peritoneal (general)  DigePeri
160x  Nasal / middle ear  E.N.T.
161x  Laryngeal  Larynx, Lary
162x  Tracheal / lung / bronchial  LungBron, Lung
163x  Pleural  Pleura
164x  Thymus / heart / mediastinum  Mediast.
165x  Respiratory / thoracic (general)  RespThor, ReDi
170x  Bone  Bone
171x  Soft tissue  SoftTiss, Soft
172x  Melanoma  Melanoma, Mela
173x  Skin (general)  Skin
174x  Contralateral breast cancer  ContraBC, Cont
179x  Uterine (general)  Uterine
180x  Cervical  Cervical, Cerv
182x  Endometrial / corpus uteri  Endomet., Endo
183x  Ovarian  Ovarian, Ovar
184x  Genital tract (general)  GenitalT, GenT
188x  Bladder  Bladder, Blad
189x  Renal / urinary system (general)  Urinary, Urol
190x  Ocular  Ocular, Eye
191x  Cerebral  Cerebral, Brai
192x  C.N.S. (general)  C.N.S.
193x  Thyroid  Thyroid, Thyr
194x  Endocrine glands (general)  Endocrin
195x  Ill-defined site  Ill-defd
196x  Lymph Nodes  LymphNod
197x  Respiratory / digestive systems  RespDige
199x  Not stated  ?/Unk.
1999  Unknown site  Site?/Unk.
200x  Lymphosarcoma / reticulosarcoma  L/RSarc
201x  Hodgkin's disease  HodgkinD, Hodg
202x  Non-Hodgkin's lymphoma / histiocyt.  N.H.L.
204x  Lymphoid leukaemia  LympLeuk
205x  Myeloid leukaemia  MyelLeuk
207x  Other leukaemia  Leukaem., Leuk
208x  Leukaemia (general)  Leukaem., Leuk
210x  Oral, benign  BenOral
211x  Digestive, benign  BenDiges
212x  Respiratory, benign  BenRespi
225x  Cerebral, benign  BenBrain
226x  Thyroid, benign  BenThyr
237x  Endocrine / nervous, uncertain  EndNerUn, ENU
238x  Uncertain  Uncert
273x  Plasma protein disorder  PlasProt
284x  Aplastic anaemia  Anaemia
289x  Other blood disease  BloodDis
11  Date of second malignancy  I6  55 - 60  DDMMYY  2nd Mal. Date
12  Distant/unknown recurrence  I1  62  Dist. Rec
Value  Description  Abbreviation
No  No
Distant  Dist
Unknown site  Unk.
13  Date of first distant/unk. recurrence  I6  64 - 69  DDMMYY  D-Rec. Date 
14  Prior local recurrence  I1  71  Local Rec
Value  Description  Abbreviation
No  No
Yes  Yes
Ipsilateral  Ipsi
Other locoregional  Othr
15  Date of prior local recurrence  I6  73 - 78  DDMMYY  L-Rec. Date 
16  State when last traced  I1  80 
Value  Description  Abbreviation
Alive  Alive 
Dead  Dead 
Lost  Lost 
Utterly lost  Lost+ 
Alive, ineligible for protocol  A/in. 
Dead, ineligible for protocol  D/in. 
Lost, presumed dead  LostD 
10  Lost and ineligible  L/in. 
11  Utterly lost and ineligible  L+/in. 
17  Date died or last traced  I6  82 - 87  DDMMYY  L.F.U. 
18  ICD revision (extra category)  I1  89  ICDRev
19  Cause of death (ICD)  I4  90 - 93  ICD  D.ICD 
20  Cause of death (extra category)  I2  95 - 96 
Value  Description  Abbreviation
1  Iatrogenic  Iatro
2  Pneumonia  Pneum
3  Lymphatic and haematopoietic  Leuk
4  Other second neoplasm  Neop2
5  Heart  Heart
6  Thrombotic or embolic  ThEmb
7  Cerebrovascular  Cvasc
8  Extraneous cause  Extra
9  Not 1 - 8, 13 - 16 and not breast cancer  NotBC
10  Unspecified non-breast-cancer cause  NotBC 
11  Breast cancer or its metastases  BCMet 
12  Unascertainable cause  Unkn. 
13  Respiratory  Respi 
14  Hepatic  Hepat 
15  Non-pneumonia infective  Infec 
16  Other vascular  OVasc 
21  Name (if given) and comments  98 - end  Name 
Missing or unknown items are left blank or set to zero.
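To show how the column positions above might be used, here is a hedged Python sketch that slices one fixed-width record into named fields. The column ranges are taken from the specification; everything else is an assumption made for illustration. The field names are hypothetical. Blank and zero integer fields are both mapped to None. Dates are returned as (day, month, two-digit year) tuples, because the format does not record the century.

```python
# (name, first column, last column, kind); columns are 1-based and inclusive,
# as in the specification. The field names are illustrative only.
LAYOUT = [
    ("trial", 1, 6, "int"),
    ("patient", 8, 19, "text"),
    ("rand_date", 21, 26, "date"),
    ("treatment_group", 28, 28, "int"),
    ("age", 30, 31, "int"),
    ("menopausal_status", 33, 33, "int"),
    ("surgery", 35, 36, "int"),
    ("axillary_status", 37, 38, "int"),
    ("er", 40, 43, "int"),
    ("pr", 45, 48, "int"),
    ("second_malignancy_site", 50, 53, "int"),
    ("second_malignancy_date", 55, 60, "date"),
    ("distant_recurrence", 62, 62, "int"),
    ("distant_recurrence_date", 64, 69, "date"),
    ("local_recurrence", 71, 71, "int"),
    ("local_recurrence_date", 73, 78, "date"),
    ("state", 80, 80, "int"),
    ("last_traced_date", 82, 87, "date"),
    ("icd_revision", 89, 89, "int"),
    ("death_icd", 90, 93, "int"),
    ("death_extra", 95, 96, "int"),
]

def parse_int(text):
    """Blank or zero means missing or unknown, so both become None."""
    text = text.strip()
    if not text:
        return None
    value = int(text)
    return value if value != 0 else None

def parse_date(text):
    """Read a DDMMYY integer field; a leading zero may appear as a blank."""
    text = text.strip()
    if not text or int(text) == 0:
        return None
    digits = text.zfill(6)
    return int(digits[0:2]), int(digits[2:4]), int(digits[4:6])

def parse_record(line):
    """Split one record into a dict keyed by the names in LAYOUT."""
    line = line.rstrip("\r\n").ljust(97)
    record = {}
    for name, first, last, kind in LAYOUT:
        raw = line[first - 1:last]
        if kind == "int":
            record[name] = parse_int(raw)
        elif kind == "date":
            record[name] = parse_date(raw)
        else:
            record[name] = raw.strip() or None
    # Name and comments run from column 98 to the end of the line.
    record["name"] = line[97:].strip() or None
    return record
```

Negative receptor values are kept as they are, so that the coded categories (-1 to -14) remain distinguishable from actual measurements.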

BC 1995 data form rubric



Special coding conventions


Patient identifier


Site 2nd malignancy / contralateral breast cancer

Guide to coding some death causes in BC 1995 data format


[End of document, updated to 11 July 2000]