Proposals:Statistics Framework Runtime Vector Size
Revision as of 17:59, 19 July 2005
Refactoring the Statistics Framework to have Runtime Length
Currently, the Statistics Framework requires the MeasurementVector to have a length defined at compile time.
Rationale for having compile time length
The statistics classes in ITK define MeasurementVectorSize (the length of each measurement vector) as a static const value. Until now this has been sufficient: typical statistics operations sample an image, where the number of measurement vectors varies but the length of each vector is fixed by the dimension of the parametric space.
Rationale for having run time length
For algorithms such as Normalized Cuts [1] and other kernel PCA feature space projection techniques [2], it may be necessary to keep the dimensionality of the feature space as a variable. MeasurementVectorSize then has to become an instance variable (ivar) set at run time rather than a static constant.
[1] Spectral Grouping Using the Nyström Method, IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), vol. 26, no. 2, Feb. 2004
[2] Nonlinear Component Analysis as a Kernel Eigenvalue Problem, Neural Computation, vol. 10, 1998
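The difference between the two rationales can be sketched in a few lines. This is only an illustration under stated assumptions: FixedLengthSample and RuntimeLengthSample are hypothetical names invented here, not ITK classes, and they hold no data.

```cpp
#include <cassert>
#include <cstddef>

// Compile-time variant: the length is a template parameter, so it is a
// static constant shared by every instance of the type.
template <std::size_t VLength>
class FixedLengthSample
{
public:
  static const std::size_t MeasurementVectorSize = VLength;
  std::size_t GetMeasurementVectorSize() const { return MeasurementVectorSize; }
};

// Run-time variant: the length is an instance variable, so each object can
// be given its own length once the feature-space dimension is known.
class RuntimeLengthSample
{
public:
  void SetMeasurementVectorSize(std::size_t s) { m_MeasurementVectorSize = s; }
  std::size_t GetMeasurementVectorSize() const { return m_MeasurementVectorSize; }

private:
  std::size_t m_MeasurementVectorSize = 0;  // 0 means "not set yet"
};

// Usage: both variants answer the same query, but only the second one can
// change its answer while the program runs.
inline void CompareLengthQueries()
{
  FixedLengthSample<3> fixedSample;
  assert(fixedSample.GetMeasurementVectorSize() == 3);

  RuntimeLengthSample runtimeSample;
  runtimeSample.SetMeasurementVectorSize(5);
  assert(runtimeSample.GetMeasurementVectorSize() == 5);
}
```

Code that only calls GetMeasurementVectorSize() works unchanged with either variant, which is why the accessor is the safer thing for client code to depend on.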
Proposed Implementation Plan
MeasurementVectorSize must be removed as a static constant and made an instance variable (ivar).
itk::FixedArray, itk::Matrix and the vnl fixed-size types, which are templated over MeasurementVectorSize, will have to be replaced by itk::Array, vnl_matrix and vnl_vector, whose size can be chosen at run time.
The SymmetricEigenAnalysis class must be used for eigen analysis since it is not templated over the dimension.
Bounds checking will have to be performed manually in all methods that currently use these FixedArrays.
API hiccups are unavoidable. Warning macros will have to be provided. An open question is how to provide deprecation warning macros appropriately; I would suggest placing them in the constructors of all affected classes.
Very few statistics classes are used outside the statistics group. One of them is itk::Histogram, so the change affects the Histogram metrics. Typedefs like IndexType and SizeType will have to be replaced.
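The manual bounds checking item in the plan above can be sketched as follows. This is a minimal illustration, assuming std::vector as a stand-in for itk::Array; the helper names CheckMeasurementVectorLength and SumOfComponents are hypothetical and do not come from ITK.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper. With fixed-size arrays a length mismatch is caught by
// the compiler; with run-time sized containers it has to be checked by hand.
inline void CheckMeasurementVectorLength(const std::vector<double> & mv,
                                         std::size_t                 expected)
{
  if (mv.size() != expected)
  {
    throw std::length_error("measurement vector length mismatch");
  }
}

// Example of a method that previously relied on the compile-time length.
inline double SumOfComponents(const std::vector<double> & mv,
                              std::size_t                 measurementVectorSize)
{
  CheckMeasurementVectorLength(mv, measurementVectorSize);
  double sum = 0.0;
  for (std::size_t i = 0; i < measurementVectorSize; ++i)
  {
    sum += mv[i];
  }
  return sum;
}

// Usage: a matching length passes, a mismatch is reported at run time.
inline void DemonstrateLengthCheck()
{
  const std::vector<double> mv(3, 1.5);
  assert(SumOfComponents(mv, 3) == 4.5);

  bool caught = false;
  try
  {
    SumOfComponents(mv, 2);
  }
  catch (const std::length_error &)
  {
    caught = true;
  }
  assert(caught);
}
```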
Testing
See NAMICSandbox/RefactoringStatisticsClasses/
The goal is to have all the tests in this folder running fine before moving them to ITK. The tests can be run as usual after building with CMake. See NAMICSandbox/RefactoringStatisticsClasses/README
Currently the following tests pass:
itkCovarianceCalculatorTest
itkDenseFrequencyContainerTest
itkHistogramTest
itkKdTreeGeneratorTest
itkListSampleTest
itkListSampleToHistogramFilterTest
itkListSampleToHistogramGeneratorTest
itkMembershipSampleTest
itkMembershipSampleGeneratorTest
itkMeanCalculatorTest
itkNeighborhoodSamplerTest
itkStatisticsAlgorithmTest
itkSubsampleTest
itkVariableDimensionHistogramTest
itkWeightedMeanCalculatorTest
itkWeightedCovarianceCalculatorTest
Proposed Transition Plan
Current Status
1. VariableDimensionHistogram
Class to handle variable length histograms, added in the NAMIC sandbox:
NAMICSandBox/RefactoringITKStatisticsClasses/src/itkVariableDimensionHistogram.h, .txx
NAMICSandBox/RefactoringITKStatisticsClasses/Tests/itkVariableDimensionHistogramTest.cxx
The class is similar to itk::Histogram with modifications to allow the dimension of the histogram (which is dependent on the size of each measurement vector) to be set at run-time.
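To make the idea of a run-time histogram dimension concrete, here is a small sketch. It is not the sandbox class: VariableDimensionHistogramSketch and all of its members are hypothetical, and for brevity every dimension shares the same range [lower, upper) with equally spaced bins.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

class VariableDimensionHistogramSketch
{
public:
  typedef std::vector<double>      MeasurementVectorType;
  typedef std::vector<std::size_t> IndexType;

  // binsPerDimension.size() is the histogram dimension, chosen at run time.
  VariableDimensionHistogramSketch(const std::vector<std::size_t> & binsPerDimension,
                                   double lower, double upper)
    : m_Size(binsPerDimension), m_Lower(lower), m_Upper(upper)
  {
    if (!(lower < upper)) { throw std::invalid_argument("empty range"); }
    std::size_t total = 1;
    for (std::size_t d = 0; d < m_Size.size(); ++d)
    {
      if (m_Size[d] == 0) { throw std::invalid_argument("zero bins"); }
      total *= m_Size[d];
    }
    m_Frequencies.assign(total, 0);
  }

  std::size_t GetMeasurementVectorSize() const { return m_Size.size(); }

  // Maps a measurement vector to its bin index; false if it is out of range.
  bool GetIndex(const MeasurementVectorType & mv, IndexType & index) const
  {
    if (mv.size() != m_Size.size()) { throw std::length_error("length mismatch"); }
    index.assign(m_Size.size(), 0);
    for (std::size_t d = 0; d < m_Size.size(); ++d)
    {
      if (mv[d] < m_Lower || mv[d] >= m_Upper) { return false; }
      const double fraction = (mv[d] - m_Lower) / (m_Upper - m_Lower);
      std::size_t  bin = static_cast<std::size_t>(fraction * static_cast<double>(m_Size[d]));
      if (bin >= m_Size[d]) { bin = m_Size[d] - 1; }  // guard against rounding
      index[d] = bin;
    }
    return true;
  }

  bool IncreaseFrequency(const MeasurementVectorType & mv)
  {
    IndexType index;
    if (!GetIndex(mv, index)) { return false; }
    ++m_Frequencies[Flatten(index)];
    return true;
  }

  unsigned long GetFrequency(const IndexType & index) const
  {
    return m_Frequencies[Flatten(index)];
  }

private:
  // Row-major offset of a multi-dimensional bin index, with bounds checks.
  std::size_t Flatten(const IndexType & index) const
  {
    if (index.size() != m_Size.size()) { throw std::length_error("index length mismatch"); }
    std::size_t offset = 0;
    for (std::size_t d = 0; d < m_Size.size(); ++d)
    {
      if (index[d] >= m_Size[d]) { throw std::out_of_range("bin index out of range"); }
      offset = offset * m_Size[d] + index[d];
    }
    return offset;
  }

  std::vector<std::size_t>   m_Size;
  double                     m_Lower;
  double                     m_Upper;
  std::vector<unsigned long> m_Frequencies;
};

// Usage: the same class serves a 2-D and a 3-D histogram.
inline void DemonstrateHistogramDimensions()
{
  VariableDimensionHistogramSketch twoD(std::vector<std::size_t>(2, 4), 0.0, 1.0);
  VariableDimensionHistogramSketch threeD(std::vector<std::size_t>(3, 4), 0.0, 1.0);
  assert(twoD.GetMeasurementVectorSize() == 2);
  assert(threeD.GetMeasurementVectorSize() == 3);
}
```

Because the index and size types are run-time sized vectors here, they play the role of the IndexType and SizeType typedefs that a fixed-dimension histogram would template over its dimension.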
2. VariableSizeMatrix
NAMICSandBox/RefactoringITKStatisticsClasses/src/itkVariableSizeMatrix
Similar to itk::Matrix, with a similar API.
3. MeasurementVectorTraits
To keep the API consistent, we've created traits for measurement vectors. The traits class is templated over the MeasurementVectorType, which was earlier constrained to be a FixedArray or one of its subclasses. For run-time size capability, we need to support itk::Array and possibly other containers such as vnl_vector. The traits give a uniform way of handling GetSize(), SetSize() and similar calls across these containers. From the doxygen header for the class:
     * For instance, the developer can create a measurement vector as
     *
     *   typename SampleType::MeasurementVectorType m_MeasurementVector
     *     = MeasurementVectorTraits< typename
     *         SampleType::MeasurementVectorType >::SetSize( s );
     *
     * This will create a measurement vector of length s if it is a FixedArray or
     * a vnl_vector_fixed, itkVector etc. If not, it returns an array of length 0
     * for the appropriate type. Other useful typedefs are defined to get the
     * length of the vector, for the MeanType, RealType for computations etc.
     *
     * To get the length of a measurement vector, the user would call
     *
     *   MeasurementVectorTraits< MeasurementVectorType >::GetSize( &mv )
     *
     * This calls the appropriate functions for the MeasurementVectorType to return
     * the size of the measurement vector mv.
     *
     *   MeasurementVectorTraits< MeasurementVectorType >::GetSize()
     *
     * This returns the length of MeasurementVectorType, which will be the true
     * length of a FixedArray, Vector, vnl_vector_fixed, Point etc. and 0 otherwise.
NAMICSandBox/RefactoringITKStatisticsClasses/src/itkMeasurementVectorTraits.h
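The two GetSize() forms described in the doxygen text can be illustrated with a stripped-down traits class. This is a sketch and not the sandbox implementation: MeasurementVectorTraitsSketch and LengthOf are hypothetical names, std::array stands in for itk::FixedArray, and std::vector stands in for itk::Array.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Primary template is left undefined so that unsupported container types
// fail at compile time.
template <typename TMeasurementVector>
struct MeasurementVectorTraitsSketch;

// Fixed-length containers: the length is known from the type alone.
template <typename T, std::size_t N>
struct MeasurementVectorTraitsSketch<std::array<T, N> >
{
  static std::size_t GetSize() { return N; }
  static std::size_t GetSize(const std::array<T, N> *) { return N; }
};

// Variable-length containers: the type alone gives 0, an instance gives its
// current length.
template <typename T>
struct MeasurementVectorTraitsSketch<std::vector<T> >
{
  static std::size_t GetSize() { return 0; }
  static std::size_t GetSize(const std::vector<T> * mv) { return mv ? mv->size() : 0; }
};

// Generic code can query the length without knowing which kind of container
// it was handed.
template <typename TMeasurementVector>
std::size_t LengthOf(const TMeasurementVector & mv)
{
  return MeasurementVectorTraitsSketch<TMeasurementVector>::GetSize(&mv);
}

// Usage: the same call works for both container kinds.
inline void DemonstrateTraits()
{
  const std::array<double, 3> fixedVector = { { 1.0, 2.0, 3.0 } };
  const std::vector<double>   variableVector(5, 0.0);
  assert(LengthOf(fixedVector) == 3);
  assert(LengthOf(variableVector) == 5);
}
```

Partial specialization keeps the dispatch at compile time, so the fixed-length case costs nothing extra at run time.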
API changes / additions
1. itk::Sample
This class now supports methods to set and get the MeasurementVector length. The length must be set explicitly when the measurement vectors are variable size containers (itk::Array etc.), as below.
    typedef itk::Sample< itk::Array< double > > SampleType;
    SampleType::Pointer sample = SampleType::New();
    sample->SetMeasurementVectorSize( length );   // length is chosen at run time
    SampleType::MeasurementVectorType m( length );
    m.Fill( 4.57 );
    sample->PushBack( m );
The earlier method will still be valid, along with all the macros. For instance, the following would also work:
    const unsigned int length = 3;
    typedef itk::Sample< itk::FixedArray< double, length > > SampleType;
    SampleType::Pointer sample = SampleType::New();
    SampleType::MeasurementVectorType m;
    m.Fill( 4.57 );
    sample->PushBack( m );
The use of the static MeasurementVectorSize constant to get the length of the measurement vector is deprecated. For instance,
    typedef itk::Sample< itk::FixedArray< double, 3 > > SampleTypeA;
    std::cout << SampleTypeA::MeasurementVectorSize << std::endl;
    typedef itk::Sample< itk::Array< double > > SampleTypeB;
    std::cout << SampleTypeB::MeasurementVectorSize << std::endl;
will produce 3 in the first case and 0 in the second. The appropriate and consistent way to get the size is the accessor sample->GetMeasurementVectorSize(), which yields 3 in both cases, provided the length of the variable-size sample was set to 3.
All classes that derive from Sample, or that perform filtering operations on a sample (which is most classes), query the sample for the length of the measurement vector.
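The pattern of setting the length on the sample and having downstream code query it can be sketched without ITK. The class below is hypothetical (SampleSketch and ComputeMean are names invented for this illustration) and uses std::vector in place of itk::Array; the length check in PushBack is an assumption about sensible behaviour, not a statement about what the sandbox code does.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for a sample of variable-length measurement vectors.
class SampleSketch
{
public:
  typedef std::vector<double> MeasurementVectorType;

  // The length should be set before any vector is added.
  void SetMeasurementVectorSize(std::size_t s)
  {
    if (!m_Data.empty() && s != m_MeasurementVectorSize)
    {
      throw std::logic_error("cannot change the length of a non-empty sample");
    }
    m_MeasurementVectorSize = s;
  }

  std::size_t GetMeasurementVectorSize() const { return m_MeasurementVectorSize; }

  // Reject vectors whose length differs from the one the sample was given.
  void PushBack(const MeasurementVectorType & mv)
  {
    if (mv.size() != m_MeasurementVectorSize)
    {
      throw std::length_error("measurement vector length mismatch");
    }
    m_Data.push_back(mv);
  }

  std::size_t Size() const { return m_Data.size(); }

  const MeasurementVectorType & GetMeasurementVector(std::size_t i) const
  {
    return m_Data.at(i);
  }

private:
  std::size_t                        m_MeasurementVectorSize = 0;
  std::vector<MeasurementVectorType> m_Data;
};

// A calculator-style function: it asks the sample for the vector length at
// run time instead of reading a compile-time constant.
inline std::vector<double> ComputeMean(const SampleSketch & sample)
{
  const std::size_t   length = sample.GetMeasurementVectorSize();
  std::vector<double> mean(length, 0.0);
  if (sample.Size() == 0)
  {
    return mean;
  }
  for (std::size_t i = 0; i < sample.Size(); ++i)
  {
    const SampleSketch::MeasurementVectorType & mv = sample.GetMeasurementVector(i);
    for (std::size_t d = 0; d < length; ++d)
    {
      mean[d] += mv[d];
    }
  }
  for (std::size_t d = 0; d < length; ++d)
  {
    mean[d] /= static_cast<double>(sample.Size());
  }
  return mean;
}

// Usage: two vectors of length 3, then the component-wise mean.
inline void DemonstrateSample()
{
  SampleSketch sample;
  sample.SetMeasurementVectorSize(3);
  sample.PushBack(SampleSketch::MeasurementVectorType(3, 1.0));
  sample.PushBack(SampleSketch::MeasurementVectorType(3, 3.0));
  const std::vector<double> mean = ComputeMean(sample);
  assert(mean.size() == 3);
  assert(mean[0] == 2.0);
}
```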
2. DistanceMetrics
The distance metric classes also provide methods to set and get the measurement vector length. As before, the length only needs to be set in cases where the measurement vector is of variable size. For instance, the following code fragments are equivalent.
    typedef itk::Vector< float, 2 > MeasurementVectorType;
    typedef itk::Statistics::EuclideanDistance< MeasurementVectorType > DistanceMetricType;
    DistanceMetricType::Pointer distanceMetric = DistanceMetricType::New();
    DistanceMetricType::OriginType originPoint;
    MeasurementVectorType queryPointA;
    MeasurementVectorType queryPointB;
    originPoint[0] = 0;
    originPoint[1] = 0;
    queryPointA[0] = 2;
    queryPointA[1] = 2;
    queryPointB[0] = 3;
    queryPointB[1] = 3;
    distanceMetric->SetOrigin( originPoint );
    std::cout << "Euclidean distance between the two query points (A and B) = "
              << distanceMetric->Evaluate( queryPointA, queryPointB ) << std::endl;
    typedef itk::Array< float > MeasurementVectorType;
    typedef itk::Statistics::EuclideanDistance< MeasurementVectorType > DistanceMetricType;
    DistanceMetricType::Pointer distanceMetric = DistanceMetricType::New();
    distanceMetric->SetMeasurementVectorSize( 2 );   // variable-size vectors: set the length explicitly
    DistanceMetricType::OriginType originPoint( 2 );
    MeasurementVectorType queryPointA( 2 );
    MeasurementVectorType queryPointB( 2 );
    originPoint[0] = 0;
    originPoint[1] = 0;
    queryPointA[0] = 2;
    queryPointA[1] = 2;
    queryPointB[0] = 3;
    queryPointB[1] = 3;
    distanceMetric->SetOrigin( originPoint );
    std::cout << "Euclidean distance between the two query points (A and B) = "
              << distanceMetric->Evaluate( queryPointA, queryPointB ) << std::endl;
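The claim that the fixed-size and variable-size fragments are equivalent can be checked with a self-contained sketch. It assumes nothing about ITK internals: EuclideanDistanceSketch is a hypothetical free function, std::array stands in for itk::Vector, and std::vector stands in for itk::Array. The query points are the same (2, 2) and (3, 3) used above.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Works for any container with size() and operator[]; the length is read at
// run time, so fixed and variable containers share one code path.
template <typename TVector>
double EuclideanDistanceSketch(const TVector & a, const TVector & b)
{
  if (a.size() != b.size())
  {
    throw std::length_error("measurement vector length mismatch");
  }
  double sumOfSquares = 0.0;
  for (std::size_t i = 0; i < a.size(); ++i)
  {
    const double diff = static_cast<double>(a[i]) - static_cast<double>(b[i]);
    sumOfSquares += diff * diff;
  }
  return std::sqrt(sumOfSquares);
}

// Usage: the same query points in a fixed-size and a variable-size container
// give the same distance.
inline void CheckFixedAndVariableAgree()
{
  const std::array<float, 2> fixedA = { { 2.0f, 2.0f } };
  const std::array<float, 2> fixedB = { { 3.0f, 3.0f } };
  const std::vector<float>   variableA(2, 2.0f);
  const std::vector<float>   variableB(2, 3.0f);

  const double fixedDistance = EuclideanDistanceSketch(fixedA, fixedB);
  const double variableDistance = EuclideanDistanceSketch(variableA, variableB);
  assert(std::fabs(fixedDistance - variableDistance) < 1e-12);
  assert(std::fabs(fixedDistance - std::sqrt(2.0)) < 1e-12);
}
```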