VI.     Experimental Analysis

A.     The CT Colonography Image Data Set

 

The proposed optimized-kernel KPCA and AKFA, together with standard KPCA and AKFA, were evaluated on CT image data sets of colonic polyp candidates comprising true polyps (TP) and false polyps (FP) detected by our CAD system [5]. We obtained studies of 146 patients who had undergone a colon-cleansing regimen in preparation for same-day optical colonoscopy. Each patient was scanned in both supine and prone positions, resulting in a total of 292 CT studies. Helical single-slice and multi-slice CT scanners (GE HiSpeed CTi, LightSpeed QX/I, and LightSpeed Ultra; GE Medical Systems, Milwaukee, WI) were used, with collimations of 1.25–5.0 mm, reconstruction intervals of 1.0–5.0 mm, X-ray tube currents of 50–260 mA, and voltages of 120–140 kVp. In-plane voxel sizes were 0.51–0.94 mm, and the CT image matrix size was 512 × 512. Of the 146 patients, 108 were normal cases and 38 were abnormal cases, with a total of 61 colonoscopy-confirmed polyps larger than 6 mm: 28 polyps were 6–9 mm, and 33 were larger than 10 mm (including 7 lesions larger than 30 mm). The CAD scheme processed the supine and prone volumetric data sets of each patient independently to yield polyp candidates, and it detected polyps in the 292 CT colonography data sets with a 98% by-polyp detection sensitivity.

The volumes of interest (VOIs) representing the polyp candidates were computed as follows. The CAD scheme provided a segmented region for each candidate. The center of the VOI was placed at the center of mass of the region, and the size of the VOI was chosen so that the entire region was covered. Finally, the VOI was resampled to 16 × 16 × 16 voxels. The VOIs so computed comprise the data set DB1, a sample of which is shown in Figure 1. There were a total of 131 true polyps (some of the larger lesions had multiple detections) and 8008 false polyps. The same procedure was carried out with VOIs of 12 × 12 × 12 voxels to build the data set DB2, which consists of 39 true polyps and 149 false polyps.
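The VOI construction above can be sketched in a few lines (a minimal illustration with NumPy/SciPy; the function and variable names are ours, not from the CAD system, and interpolation details may differ from the original implementation):

```python
import numpy as np
from scipy.ndimage import center_of_mass, zoom

def extract_voi(volume, mask, out_size=16):
    """Crop a cube around a segmented candidate and resample it.

    volume : 3-D CT intensity array
    mask   : boolean array of the same shape, the CAD-segmented region
    """
    # Center the VOI at the center of mass of the segmented region.
    center = np.round(center_of_mass(mask)).astype(int)
    # Choose a cube half-width just large enough to cover the whole region.
    coords = np.argwhere(mask)
    half = int(np.max(np.abs(coords - center))) + 1
    # Clip the cube to the volume boundaries and crop.
    lo = np.maximum(center - half, 0)
    hi = np.minimum(center + half, np.array(volume.shape))
    cube = volume[lo[0]:hi[0], lo[1]:hi[1], lo[2]:hi[2]]
    # Resample the cropped cube to out_size^3 voxels (trilinear interpolation).
    factors = [out_size / s for s in cube.shape]
    return zoom(cube, factors, order=1)
```

The same routine with `out_size=12` yields the DB2-style VOIs.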

B.     Computational Time with Data Sizes: Experiment with DB1

The results on computation time, in seconds, are shown in Table I. All results were obtained using the Statistical Pattern Recognition Toolbox [17] on Matlab 7.0.1 (R14) for the Gram-matrix calculation and the KPCA algorithm, running on the Partners Research Computing cluster [18]. The cluster had 26 working nodes on HP servers; each node had 72 GB of storage (the head node had 380 GB), two 3 GHz AMD Opteron 32/64 CPUs, and 4 GB of RAM. The nodes communicated via a GigE switch and used an NFS mount. The eigenspace dimension was set to 70 for measuring computation time. Table I shows that the computation time of KPCA increased rapidly with the data size n: at n = 3500, the computation times of KPCA and SKFA were 9.4 and 30.5 times longer, respectively, than that of AKFA. When computation time is plotted against data size n on common-logarithm scales, the results fit the expected curves and thus validate the complexity analyses in the methodology sections. This clearly shows that the proposed AKFA is much faster than the existing KPCA and SKFA methods, especially when the data size is large.
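The log–log check described above amounts to fitting a straight line to log(time) versus log(n); the fitted slope estimates the empirical complexity exponent. A minimal sketch with synthetic timings (the numbers below are illustrative, not the measurements in Table I):

```python
import numpy as np

# Synthetic timings following cubic growth, t = c * n^3
# (illustrative only; not the measurements from Table I).
n = np.array([500.0, 1000.0, 1500.0, 2000.0, 2500.0, 3000.0, 3500.0])
t = 2e-9 * n**3

# On common-logarithm scales, log10 t = p * log10 n + log10 c,
# so the fitted slope p is the empirical complexity exponent.
slope, intercept = np.polyfit(np.log10(n), np.log10(t), 1)
print(f"empirical complexity exponent: {slope:.2f}")  # prints 3.00
```

Applying the same fit to measured timings of KPCA, SKFA, and AKFA lets one compare the empirical exponents against the theoretical complexity analyses.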

C.     Evaluation of Classification Accuracy: Optimized Kernel versus Unoptimized Kernel for the DB2 Data Set

 

In order to analyze how the optimized kernel affects the classification performance on polyp candidates, we used the k-nearest-neighbor classifier on the image vectors in the reduced eigenspace. The data set DB2 was used in the experiments described in this section. The training data and test data were selected according to the arrangement given in Table II. We first evaluated the performance of the classifier on the feature spaces obtained by KPCA and AKFA, and then on the kernel-optimized feature spaces obtained by KPCA (Case 1) and AKFA (Case 2).

 

Case 1: KPCA. The k-nearest-neighbor classifier was applied to the data after feature extraction with the KPCA algorithm using the data-dependent kernel, which extracted a total of 75 features during training. The data set described in Arrangement 1 of Table II was used, and 1–10 nearest neighbors were considered. The classification accuracy against the number of nearest neighbors is given in Table III. When 9 nearest neighbors were considered, the test data in the reduced eigenspace were grouped as given in Table II, resulting in a classification accuracy of 97.50%.

Case 2: AKFA. The AKFA algorithm was applied to the same training data set given in Table II to extract 75 features. The test data were then classified with the k-nearest-neighbor classifier using 1–10 nearest neighbors; the results are summarized in Table III.
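The classifier used in both cases can be sketched generically as follows (a plain NumPy majority-vote k-NN; the feature vectors would come from the KPCA or AKFA projection, which is not reproduced here):

```python
import numpy as np

def knn_classify(train_feats, train_labels, test_feats, k):
    """Majority-vote k-nearest-neighbor classification in a feature space."""
    preds = []
    for x in test_feats:
        # Euclidean distances from x to every training vector.
        d = np.linalg.norm(train_feats - x, axis=1)
        # Labels of the k nearest training vectors.
        nearest = train_labels[np.argsort(d)[:k]]
        # Majority vote among the k neighbors.
        values, counts = np.unique(nearest, return_counts=True)
        preds.append(values[np.argmax(counts)])
    return np.array(preds)
```

Classification accuracy is then simply `np.mean(knn_classify(...) == test_labels)`.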

 

Table II: Arrangement of training and test data for classification. The data set DB2 comprised 39 true-polyp and 149 false-polyp vectors.

 

 

 

 

Arrangement 1        Proportion of total data set   Number of vectors   Total

Training Set   TP    80.00%                         31                  148
               FP    78.30%                         117
Test Set       TP    20.00%                         8                   40
               FP    21.70%                         32
 

 

Table III: Classification accuracy of each feature extraction algorithm against the number k of nearest neighbors, with and without kernel optimization.

 

No. of Nearest     Classification Accuracy %        Classification Accuracy %
Neighbors (k)      (with kernel optimization)       (without kernel optimization)
                   KPCA        AKFA                 KPCA        AKFA
 1                 100.00      97.50                97.50       95.00
 2                  97.50      92.50               100.00       90.00
 3                  97.50      95.00                95.50       95.00
 4                 100.00     100.00               100.00       90.00
 5                 100.00     100.00               100.00       92.50
 6                  95.00      92.50                97.50       92.50
 7                  97.50      92.50                97.50       92.50
 8                  95.00      95.00                95.00       90.00
 9                  97.50      92.50                95.00       92.50
10                  92.50      90.00                92.50       87.50

 

 

D.     Reconstruction Error

  The reconstruction error results for kernel-optimized KPCA and AKFA, and for standard KPCA and AKFA, on this data set are summarized in Table IV. The results show that kernel-optimized KPCA and AKFA reconstruct the data about as well as standard KPCA and AKFA, as evidenced by the similarly small reconstruction errors. More training data, coupled with the ability to extract more features, would likely have produced a more accurate representation of the data in the reduced eigenspace, and therefore closer results for KPCA and AKFA.
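For reference, the feature-space mean-square reconstruction error of a kernel PCA model is k(x, x) minus the squared norm of the projection onto the retained eigenvectors. A minimal sketch for an uncentered Gaussian-kernel KPCA (kernel centering and the data-dependent kernel are omitted for brevity; the names are ours):

```python
import numpy as np

def kpca_reconstruction_error(X, gamma, n_components):
    """Mean feature-space reconstruction error of uncentered Gaussian KPCA.

    For each sample: error = k(x, x) - ||projection onto retained eigvecs||^2.
    """
    # Gram matrix of the Gaussian (RBF) kernel.
    sq = np.sum(X**2, axis=1)
    K = np.exp(-gamma * (sq[:, None] + sq[None, :] - 2.0 * X @ X.T))
    # Eigendecomposition; keep the n_components largest eigenpairs.
    w, V = np.linalg.eigh(K)
    w, V = w[::-1][:n_components], V[:, ::-1][:, :n_components]
    w = np.clip(w, 1e-12, None)  # guard against tiny negative eigenvalues
    # Feature-space eigenvector i is v_i = sum_m (V[m, i] / sqrt(w_i)) phi(x_m),
    # so the projection coefficients are K V / sqrt(w).
    proj = (K @ V) / np.sqrt(w)
    # k(x, x) = 1 for the Gaussian kernel.
    return float(np.mean(1.0 - np.sum(proj**2, axis=1)))
```

Retaining fewer components increases the error, and with all components retained the error vanishes; the figures in Table IV correspond to the feature subspaces used in the experiments.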

 

 Table IV: Mean Square Reconstruction Error for DB2

 

Feature Extraction Algorithm     Mean Square Error (%)
KPCA                             6.74
AKFA                             10.74
WKOKPCA                          6.84
WKOAKFA                          10.99