

Evaluation of a program for automatic interpretation of electrocardiograms

González Fernández, René; Rivero Varona, Marta M.; Fernández Núñez, Raisa

Instituto Central de Investigación Digital. Hospital Clínico Quirúrgico "Hermanos Ameijeiras"
La Habana, Cuba

Abstract
Introduction
Objectives
Material and Methods
Results
Discussion
Conclusions

Abstract
Introduction: Automatic interpretation of the electrocardiogram (ECG) is now widely used because of the advantages it brings to everyday practice. In Cuba, work in this field began in 1984 and led to the CARDIOCID family of instruments devoted to this task. The aim of this work is to present the results of an evaluation carried out to confirm the effectiveness of the algorithms used for automatic ECG interpretation in the CARDIOCID instruments.
Material and Methods: A total of 706 ECGs were recorded with two instruments installed at the "Hermanos Ameijeiras" Hospital. All ECGs were recorded at a paper speed of 25 mm/s, a sensitivity of 10 mm/mV and with the digital filter activated. For each case, the instrument acquired eight seconds of ECG, identified and measured the events of interest and reached conclusions about the patient's state. All the information was stored on disk and reviewed afterwards by a cardiology specialist with more than 15 years of experience in the visual review of ECGs, on a computer and with the help of a program that made this task easier. The specialist's judgment was always taken as true, and Sensitivity (S), Positive Predictive Value (PPV), Specificity (Sp) and Negative Predictive Value (NPV) were calculated.
Results: The sample comprised 279 normal and 427 pathological cases. Among the normal cases there were only two false negatives (FN) and no false positives (FP), so Sensitivity reached 99.28% and PPV was 100%. Overall, the instrument issued 983 diagnostic criteria, of which 952 were true positives (TP) and 31 were FP; a further 8 criteria were missed (FN). Overall Sensitivity was 99.17% and PPV was 96.85%.
Discussion: CARDIOCID-BS was very effective on normal cases: there were no FP and only two FN. Overall sensitivity was very high because the instrument identified most of the alterations present. PPV was also high, since most of the diagnoses made by CARDIOCID-BS were correct. There were few FN, only eight, among almost 1000 situations to be identified. There were more FP than FN, which shows a slight tendency to overdiagnosis. The errors were concentrated in some of the criteria associated with atrial activity, a characteristic of these methods that has been described internationally.
Conclusions: The results obtained confirm the practical usefulness of CARDIOCID-BS for the automatic interpretation of the ECG.

Introduction: Automatic interpretation of the electrocardiogram (ECG) has been expanding rapidly since the 1980s, with the aim of replacing the traditional method, visual inspection of the printed ECG report by a specialist. Its main advantages are consistent quality in the interpretation process, standardization of the criteria used and the possibility of disseminating the knowledge of highly qualified specialists. In addition, the technique allows the creation of databases to support research, and the equipment that runs the program can be used for teaching. Four models (B, M, MC and BS) of the CARDIOCID digital electrocardiograph have been created in Cuba since 1984, when studies in this field began.

Objectives: The aim of this paper is to evaluate the effectiveness of the automatic ECG interpretation algorithms used in the CARDIOCID-BS digital electrocardiograph.

Material and Methods: A total of 706 ECGs were recorded using two CARDIOCID-BS electrocardiographs installed at the Clinical Surgical Hospital "Hermanos Ameijeiras" in Havana, Cuba. All recordings were made under the same conditions (paper speed 25 mm/s, amplitude scale 10 mm/mV and the digital filter of the equipment activated). The CARDIOCID-BS electrocardiographs digitize the twelve leads of the standard ECG simultaneously, using a 12-bit A/D converter (3.15 µV per converter unit) at a sampling rate of 500 Hz, and use the last eight seconds of recorded signal to identify and measure the electrocardiographic events. A set of medical criteria is evaluated and conclusions are reached on the basis of these measurements. All the information related to each recorded case is also stored on magnetic disks that can be read on IBM-compatible personal computers.
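
As a rough illustration of the acquisition figures above, the following Python sketch computes how many samples an eight-second analysis window holds and converts raw converter units to millivolts and to millimetres on paper. It assumes that the stated resolution means 3.15 microvolts per converter unit and that raw values are signed offsets from the baseline. The constant and function names are hypothetical and are not part of the CARDIOCID-BS software.

```python
SAMPLING_RATE_HZ = 500        # sampling rate stated in the text
WINDOW_S = 8                  # seconds of signal used for the analysis
LEADS = 12                    # leads of the standard ECG
UV_PER_UNIT = 3.15            # assumption: microvolts per converter unit
GAIN_MM_PER_MV = 10.0         # amplitude scale of the recordings
PAPER_SPEED_MM_PER_S = 25.0   # paper speed of the recordings


def samples_per_lead() -> int:
    """Number of samples in the analysis window of one lead."""
    return SAMPLING_RATE_HZ * WINDOW_S


def units_to_mv(raw_units: int) -> float:
    """Convert a signed raw converter value to millivolts."""
    return raw_units * UV_PER_UNIT / 1000.0


def mv_to_paper_mm(millivolts: float) -> float:
    """Vertical deflection on the millimetric grid for a given amplitude."""
    return millivolts * GAIN_MM_PER_MV


def samples_to_paper_mm(n_samples: int) -> float:
    """Horizontal distance on the millimetric grid for a run of samples."""
    return n_samples / SAMPLING_RATE_HZ * PAPER_SPEED_MM_PER_S


if __name__ == "__main__":
    print(samples_per_lead())          # 4000 samples per lead
    print(samples_per_lead() * LEADS)  # 48000 samples per record
    print(samples_to_paper_mm(samples_per_lead()))  # window length in mm
```

Under these assumptions, one second of signal (500 samples) spans 25 mm of the grid and a 1 mV deflection spans 10 mm, which is the format the specialist is used to reading.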

A cardiology specialist with 15 years of experience in the visual inspection of ECGs analyzed the signal of each recorded case and gave a diagnosis, which was taken as the criterion of truth (reference standard) for evaluating the equipment's interpretation of the signal. The analysis used a program with a graphical interface that let the specialist read the information related to every record and review the ECG in more detail than on a printed report while keeping the traditional format: the signal was displayed on a millimetric grid in order to take advantage of the specialist's visual training. The specialist did not know the automatic diagnosis made by the equipment. The following variables were studied: True Positive (TP), False Positive (FP), True Negative (TN) and False Negative (FN). Sensitivity (S), Positive Predictivity (PP), Specificity (Sp) and Negative Predictivity (NP) were calculated from these variables.
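
The four indices can be computed from the four counts with the standard definitions. The Python sketch below is only an illustration, not the program used in the study. The class and function names are hypothetical, and returning 100 when a denominator is zero is an assumed convention. The example counts are those reported for the normal-ECG criterion in Table 5.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    """Agreement counts for one diagnostic criterion."""
    tp: int  # issued by the equipment and given by the specialist
    fp: int  # issued by the equipment, not given by the specialist
    tn: int  # given by neither
    fn: int  # given by the specialist, missed by the equipment


def _percent(numerator: int, denominator: int) -> float:
    # Assumed convention: an empty denominator is reported as 100%.
    if denominator == 0:
        return 100.0
    return 100.0 * numerator / denominator


def sensitivity(c: Counts) -> float:
    return _percent(c.tp, c.tp + c.fn)


def specificity(c: Counts) -> float:
    return _percent(c.tn, c.tn + c.fp)


def positive_predictivity(c: Counts) -> float:
    return _percent(c.tp, c.tp + c.fp)


def negative_predictivity(c: Counts) -> float:
    return _percent(c.tn, c.tn + c.fn)


if __name__ == "__main__":
    normal = Counts(tp=277, fp=0, tn=427, fn=2)
    print(round(sensitivity(normal), 2))            # 99.28
    print(round(specificity(normal), 2))            # 100.0
    print(round(positive_predictivity(normal), 2))  # 100.0
    print(round(negative_predictivity(normal), 2))  # 99.53
```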

Table 1. Sample composition.

                    Total
Normal ECG            279
Pathological ECG      427
Total                 706

Results: The evaluation sample included 279 normal and 427 pathological patients. Of the 279 normal patients, 277 were correctly diagnosed by the equipment, that is, there were two FN. There were no FP in the diagnosis of normality, since the algorithm never interpreted as normal a patient who was not.

Overall, the equipment issued 983 diagnoses in the analysis of the 706 cases, of which 952 agreed with those given by the specialist. Eight diagnostic criteria given by the cardiologist were missed by the algorithm (FN), and 31 diagnostic criteria issued by the algorithm were not given by the specialist (FP). Overall Sensitivity was 99.17% and Positive Predictivity was 96.85%.
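
As a worked check, assuming the standard definitions of sensitivity and positive predictivity, the overall figures in Table 3 follow directly from these counts (the 983 diagnoses issued by the equipment are TP + FP):

```latex
\[
S = \frac{TP}{TP + FN} = \frac{952}{952 + 8} = \frac{952}{960} \approx 99.17\%,
\qquad
PP = \frac{TP}{TP + FP} = \frac{952}{952 + 31} = \frac{952}{983} \approx 96.85\%.
\]
```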

Of the 77 diagnostic criteria included in the CARDIOCID-BS, the errors were concentrated in ten, which represents 12.99% of the criteria used; these were mainly criteria related to disorders of atrial activity. Of the 77 criteria, 69 (89.61%) were present at least once in the evaluation sample.

Table 2. Automatic interpretation performance for normal patients.

                        Quantity
Normal ECGs                  279
True Positives               277
False Positives                0
False Negatives                2
Sensitivity               99.28%
Positive Predictivity       100%

Table 3. Global performance of the ECG automatic interpretation.

Patients                              706
Diagnoses made by CARDIOCID-BS        983
True Positives                        952
False Positives                        31
False Negatives                         8
Sensitivity                        99.17%
Positive Predictivity              96.85%

Table 4. Errors by diagnostic criterion.

Code  FP  FN  Total  Criterion
   1   0   2      2  Normal ECG.
   7   1   1      2  Variable RR. Rule out premature contractions.
  10   9   1     10  Variable RR and absent P wave. Rule out atrial fibrillation.
  19   4   1      5  P duration exceeds 130 ms. Left atrial enlargement and/or intra-atrial conduction disorder.
  20   3   1      4  Negative P wave of 0.14 mV or more in V1. Left atrial enlargement and/or intra-atrial conduction disorder.
  22  10   1     11  PR shorter than 100 ms.
  64   1   1      2  Negative T in two leads. Nonspecific ventricular repolarization disorders.
  67   1   0      1  QS present in V2 or V3. Rule out myocardial infarction (MI) or anteroseptal fibrosis.
  68   1   0      1  Q waves present in precordial leads. Rule out anterior MI.
  71   1   0      1  R <= 0.1 mV and negative T in two leads from V2 to V5. Compatible with anterior MI.
      31   8     39  Total

Table 5. Performance of the studied variables.

Criterion   TP  FP     TN  FN       S      Sp      PP      NP
        1  277   0    427   2   99.28     100     100   99.53
        2    0   0    706   0     100     100     100     100
        3   14   0    692   0     100     100     100     100
        4   13   0    693   0     100     100     100     100
        7   18   1    687   1   94.74   99.85   94.74   99.85
       10   31   9    666   1   96.88   98.67   77.50   99.85
       12    0   0    706   0     100     100     100     100
       13    0   0    706   0     100     100     100     100
       14    9   0    697   0     100     100     100     100
       18    5   0    701   0     100     100     100     100
       19   29   4    673   1   96.67   99.41   87.88   99.85
       20   32   3    671   1   96.97   99.55   91.43   99.85
       21    3   0    703   0     100     100     100     100
       22   44  10    652   1   97.78   98.49   81.48   99.85
       23    2   0    704   0     100     100     100     100
       24    6   0    700   0     100     100     100     100
       25    0   0    706   0     100     100     100     100
       26    0   0    706   0     100     100     100     100
       27    7   0    699   0     100     100     100     100
       28   16   0    690   0     100     100     100     100
       29    7   0    699   0     100     100     100     100
       30    4   0    702   0     100     100     100     100
       31   10   0    696   0     100     100     100     100
       32    0   0    706   0     100     100     100     100
       33    6   0    700   0     100     100     100     100
       34    2   0    704   0     100     100     100     100
       35    2   0    704   0     100     100     100     100
       36    4   0    702   0     100     100     100     100
       37   10   0    696   0     100     100     100     100
       38    7   0    699   0     100     100     100     100
       39    3   0    703   0     100     100     100     100
       40    1   0    705   0     100     100     100     100
       41    2   0    704   0     100     100     100     100
       42    4   0    702   0     100     100     100     100
       43    2   0    704   0     100     100     100     100
       44    8   0    698   0     100     100     100     100
       45    5   0    701   0     100     100     100     100
       46    0   0    706   0     100     100     100     100
       47    4   0    702   0     100     100     100     100
       48   12   0    694   0     100     100     100     100
       49    9   0    697   0     100     100     100     100
       50    5   0    701   0     100     100     100     100
       51   10   0    696   0     100     100     100     100
       52    3   0    703   0     100     100     100     100
       53    3   0    703   0     100     100     100     100
       54   15   0    691   0     100     100     100     100
       55    6   0    700   0     100     100     100     100
       56    3   0    703   0     100     100     100     100
       57    1   0    705   0     100     100     100     100
       58    3   0    703   0     100     100     100     100
       59   12   0    694   0     100     100     100     100
       60    9   0    697   0     100     100     100     100
       61    0   0    706   0     100     100     100     100
       62    4   0    702   0     100     100     100     100
       64   22   1    683   1   95.65   99.85   95.65   99.85
       65    3   0    703   0     100     100     100     100
       66    9   0    697   0     100     100     100     100
       67   19   1    686   0     100   99.85   95.00     100
       68   33   1    672   0     100   99.85   97.06     100
       69    5   0    701   0     100     100     100     100
       70   17   0    689   0     100     100     100     100
       71   23   1    682   0     100   99.85   95.83     100
       72   12   0    694   0     100     100     100     100
       73    8   0    698   0     100     100     100     100
       74    4   0    702   0     100     100     100     100
       75   16   0    690   0     100     100     100     100
       76   10   0    696   0     100     100     100     100
       77    5   0    701   0     100     100     100     100
       78   11   0    695   0     100     100     100     100
       79    8   0    698   0     100     100     100     100
       80   10   0    696   0     100     100     100     100
       83    3   0    703   0     100     100     100     100
       84   22   0    684   0     100     100     100     100
       85   11   0    695   0     100     100     100     100
       86   10   0    696   0     100     100     100     100
       87    7   0    699   0     100     100     100     100
       88    2   0    704   0     100     100     100     100
    Total  952  31  53377   8   99.17   99.84   96.85   99.99

Discussion: Some of the criteria issued by the CARDIOCID-BS are tentative (they include the words "rule out", "compatible with" or "possible", leaving the conclusion in the cardiologist's hands). This was not taken into account when deciding whether a FP was present; all criteria were treated the same way, that is, the evaluation was strict. The great majority of the medical criteria used in the CARDIOCID-BS (69 of 77) were applied at least once, which makes the results representative.

As can be seen, cases with some kind of abnormality predominated, although the patients were selected randomly. The CARDIOCID-BS was very effective in the diagnosis of normality: no patient with an abnormal ECG was reported as normal, and only twice was a normal case reported as pathological. The equipment was therefore very reliable in classifying patients as normal or not. This is very useful because it allows specialists to concentrate on the pathological cases, given that in this evaluation the equipment never classified as normal a patient who was not. In the two errors where a normal ECG was classified as pathological, the diagnosis included the words "RULE OUT"; that is, it did not state that an abnormality was present but asked the specialist to evaluate the presence of a given situation, because the measurements were not decisive.

The overall sensitivity was high (99.17%), which indicates that the evaluated algorithm agreed with the cardiologist on most of the abnormalities detected. The PP was also high (96.85%), which means that most of the diagnoses issued by the equipment were correct. These values show that the behavior of the equipment was very similar to that of the cardiologist taken as the reference standard, and indicate that a good approximation to the way experienced specialists analyze the ECG was achieved. There were few FN, only eight; that is, very few of the abnormalities present in the evaluation sample were missed. There were considerably more FP (31) than FN (8), which indicates that the equipment tends to overdiagnose rather than to miss existing abnormalities, although in neither case was the tendency alarming or an obstacle to practical use.

The errors were concentrated in ten of the 77 diagnostic criteria included in the CARDIOCID-BS. As Table 4 shows, they fell mostly on criteria that depend on correctly identifying and measuring the waves that represent atrial activity. It is well known that these are the most difficult events to identify by digital methods. Even so, 92.18% of the zones corresponding to the P waves present were correctly identified (recall that all leads are acquired simultaneously), although this value is lower than that for the QRS complexes, all of which were identified with no FP.

Fifteen of the ECGs with an incorrect diagnosis had a significant noise level caused by poor electrode contact. The values of S, Sp, PP and NP were high for the great majority of the criteria, and the errors were concentrated in criteria 10 and 22, as can be seen in Table 4.

Conclusions: The results confirm the practical usefulness of the CARDIOCID-BS electrocardiograph for automatic ECG interpretation. The main deficiencies of the automatic interpretation process were related to some criteria that rely on measuring the waves that represent atrial activity (P, F and f). These do not invalidate the usefulness of the evaluated equipment, and work to improve this aspect continues.

 

Questions, contributions and commentaries to the Authors: send an e-mail message (up to 15 lines, without attachments) to arritmias@listserv.rediris.es , written either in English, Spanish, or Portuguese.


© CETIFAC
Bioengineering
UNER

Update
Dec/07/1999