Computer Methods For Pulmonary Nodule Characterization From CT Images
Computed tomography (CT) scans provide radiologists with a non-invasive method of imaging internal structures of the body. Although CT scans have enabled the earlier detection of suspicious nodules, these nodules are often small and difficult for radiologists to classify accurately. An automated system was developed to classify a pulmonary nodule based on image features extracted from a single CT scan. Several critical issues related to performance evaluation of such systems were also examined. The image features considered in the system were statistics from the density distribution, shape, curvature, and boundary features. The shape and density features were computed through moment analysis of the segmented nodule. Local curvature was computed from a triangle-tessellated surface of the nodule; the statistics of the distribution of curvatures were used as features in the system. Finally, the boundary of the nodule was examined to quantify the transition region between the nodule and lung parenchyma. This was accomplished by combining the grayscale information and 3D model to measure the gradient on the surface of the nodule. These methods resulted in a total of 43 features. For comparison, 2D features were computed for the density and shape features, resulting in 26 features. Four feature classification schemes were evaluated: logistic regression, k-nearest-neighbors, distance-weighted nearest-neighbors, and support vector machines (SVM). These features and classifiers were validated on a large dataset of 259 nodules. The best performance, an area under the ROC curve (AUC) of 0.702, was achieved using 3D features and the logistic regression classifier. A major consideration when evaluating a nodule classification system is whether the system presents an improvement over a baseline performance. Since the majority of large nodules in many datasets are malignant, the impact of nodule size on the performance of the classification system was examined.
This was accomplished by comparing the performance of the system using feature sets that included size-dependent features with its performance using feature sets that excluded those features. The performance of size alone, estimated using a size-threshold classifier, was an AUC of 0.653. For the SVM classifier, removing size-dependent features reduced the performance from an AUC of 0.69 to 0.61. To approximate the performance that might be obtained on a dataset without a size bias, a subset of cases was selected where the benign and malignant nodules were of similar sizes. On this subset, size was a weak feature, with an AUC of 0.507, and features that were not dependent on size performed better than size-dependent features for SVM, with an AUC of 0.63 compared to 0.52. While other methods have been proposed for performing nodule classification, this is the first study to comprehensively examine the performance impact of datasets whose nodules exhibit a bias in size.
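As an illustration of the size-only baseline described above, the following is a minimal sketch, not the implementation used in the thesis. It assumes that sweeping a size threshold over all cut-offs is equivalent to using nodule size directly as the classifier score, so the AUC can be computed as the fraction of malignant/benign pairs in which the malignant nodule is larger (ties counted as one half). The function name `auc_from_scores` and the diameters and labels below are hypothetical and are not data from the study.

```python
def auc_from_scores(scores, labels):
    """AUC as the probability that a randomly chosen positive case
    scores higher than a randomly chosen negative case (ties = 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one malignant and one benign case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical nodule diameters (mm) and labels (1 = malignant, 0 = benign);
# illustrative values only, not taken from the 259-nodule dataset.
diameters_mm = [4.0, 5.5, 6.0, 7.0, 8.0, 9.5, 12.0, 15.0]
is_malignant = [0, 0, 1, 0, 1, 0, 1, 1]

# Using size itself as the score gives the size-threshold baseline AUC.
size_auc = auc_from_scores(diameters_mm, is_malignant)
print(f"size-only AUC: {size_auc:.4f}")
```

A baseline AUC well above 0.5 on a given dataset indicates that size alone separates the classes to some degree, so a feature-based classifier should be judged against that value rather than against chance.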
pulmonary nodule; characterization; lung cancer
Reeves, Anthony P
M.S., Electrical Engineering
Master of Science
dissertation or thesis