ITK/Procedure for Contributing New Classes and Algorithms
Revision as of 18:53, 16 December 2005
Introduction
This page describes the procedure for contributing new algorithms and classes to the [Insight Toolkit].
The fundamental idea of this procedure is to make the [Insight Journal] the entry gate for new classes and algorithms into the Insight Toolkit. This means that developers should not commit new classes to the CVS repository unless they have already been posted as papers to the Insight Journal and have received positive reviews from the community.
Although this may appear as a bureaucratic procedure, it should be quite agile in practice because the Insight Journal is not a typical Journal. The time between submitting a paper and finding it posted online should be in the range of minutes for an average case, and a couple of hours as a worst case. The time difference will depend on how computing-intensive the testing for the code is.
As soon as a paper is posted online, the source code that must accompany the paper also becomes freely available online. So the time for sharing the contributions with the community should in all cases be less than 24 hours.
The following sections describe the rationale behind this procedure, and the technical details of how to prepare a submission and follow it through until the source code is committed to the ITK CVS repository.
The Rationale
The rationale behind this procedure is to pursue the following goals:
- Technical correctness of new contributions
- Avoid duplication of functionalities
- Maximize reuse of existing code
- Maximize generalization of the algorithm implementations
- Enforce validation, testing and code coverage
- Maximize maintainability
- Ensure that new algorithms are properly documented
- Gather feedback from the community
- Hold a continuously open forum where algorithmic and performance issues are discussed.
Since some of these goals may conflict, it will be the prerogative of the Oversight Committee to rule on whether one criterion should be given more importance than another. These decisions will have to be made on a case-by-case basis.
Technical Correctness
The community should ponder whether the technical concepts behind a new algorithm are acceptable. Technical correctness requires the contributor to provide background on the proposed algorithm. Some algorithms may be so widely known that a simple citation of a major paper describing the algorithm is enough to satisfy the requirement of technical correctness. Less known algorithms require more detailed descriptions in order to make the case for their technical correctness. There are no hard rules on how deep this description should be. The only clear-cut criterion is that it should be clear enough not to raise major objections from the community.
Avoid duplication of functionalities
Given the large number of classes in the toolkit and the fact that the development effort was distributed among multiple institutions, it is not trivial for a single developer to establish whether a particular algorithm is already implemented in the toolkit. Therefore, when it comes to adding new functionality, other developers should be given an opportunity to point out existing code that may already provide such functionality or that may help to implement the suggested new functionality.
Maximize reuse of existing code
While the submissions are exposed in the Journal, other developers and users may find that parts of the algorithm could be implemented using existing classes in the toolkit. By posting those comments in their reviews, they will help the authors to refactor their code in order to use those existing classes.
Maximize generalization of the algorithm implementations
It is common for an algorithm to be implemented in the context of a very specific problem. Authors will typically post the algorithms they have used to solve a specific problem. By opening the papers to public non-anonymous reviews, readers and reviewers may find the algorithm applicable to other problems, and may suggest ways of generalizing it. In this way a larger community will benefit from the insertion of a generalized algorithm, instead of the benefit being restricted to those involved with the specific problem for which the algorithm was originally intended.
Enforce validation, testing and code coverage
It is fundamental to make sure that new algorithms work as advertised. The practical way of doing this is to provide a test with realistic input data and typical parameters for the algorithm, in such a way that it can be run by anybody. The test should also include the expected output, so that when it is executed by other users there is a baseline for comparison that makes it possible to evaluate whether the algorithm is actually producing the expected output or not.
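The baseline comparison described above can be sketched in a few lines. This is only an illustration, not the code used by the Insight Journal: the helper name `matches_baseline`, the tolerance value, and the use of flat lists of pixel values are all assumptions made for the example.

```python
import math


def matches_baseline(output, baseline, tolerance=1e-6):
    """Return True when every output value is within `tolerance` of the baseline.

    Hypothetical helper: `output` and `baseline` stand in for the pixel
    values of the image produced by the test and of the expected image
    shipped with the submission.
    """
    if len(output) != len(baseline):
        return False
    return all(
        math.isclose(o, b, rel_tol=0.0, abs_tol=tolerance)
        for o, b in zip(output, baseline)
    )


# Expected output shipped with the submission (made-up values).
baseline = [0.0, 0.5, 1.0]

# A run that reproduces the baseline up to rounding noise is accepted.
print(matches_baseline([0.0, 0.5000000001, 1.0], baseline))
# A run that drifts away from the baseline is rejected.
print(matches_baseline([0.0, 0.6, 1.0], baseline))
```

Comparing with an absolute tolerance rather than exact equality is a design choice of this sketch; it keeps the test from failing on harmless floating-point differences between platforms.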
Code coverage should also be brought as close to 100% as possible before the classes are contributed to the toolkit. The reason is that a passing test is only as meaningful as the code coverage it achieves. In other words, a test that passes but only exercises 20% of the code in a class cannot be claimed to be a sufficient demonstration of the correctness of the implementation. Such a test only proves that 20% of the class works as advertised.
Failure to provide sufficient code coverage in the initial commit of a class is the most common cause of bugs going undetected in the toolkit for long periods. It is also the most common cause of classes breaking without being noticed when other changes in the code affect the untested sections.
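The 20% figure in the example above is simple arithmetic over line counts. The sketch below shows that computation under stated assumptions: the function name `coverage_fraction` and the line numbers are invented for illustration, and real coverage tools report this figure themselves.

```python
def coverage_fraction(executed_lines, executable_lines):
    """Fraction of the executable lines that the test actually ran."""
    executable = set(executable_lines)
    if not executable:
        raise ValueError("class has no executable lines")
    return len(set(executed_lines) & executable) / len(executable)


# Hypothetical class with 50 executable lines, numbered 1 to 50.
executable = range(1, 51)
# The test passes, but it only ever runs lines 1 to 10.
executed = range(1, 11)

fraction = coverage_fraction(executed, executable)
print(f"coverage: {fraction:.0%}")  # 10 of 50 lines were exercised
```

In this sketch the remaining 40 lines could contain any number of defects and the test would still pass.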
Lack of code coverage breaks the basic assumptions of the quality control system based on the Dart Dashboard.
Maximize maintainability
Once a class is included in the toolkit, the developer community commits to maintaining this code for as long as the Toolkit is available. This can easily mean five to ten years of software maintenance. It is a well known fact that 80% of the cost and effort of software development is spent in maintenance and bug fixes. The bulk of this maintenance effort is the time spent by future developers in understanding
Therefore it is quite important for any class contributed to the toolkit to be analyzed for maintainability.
Among the most important criteria for maintainability
Ensure that new algorithms are properly documented
Proper documentation of new algorithms is key to encouraging their use by the community. It is quite common for users on the mailing list to post questions regarding a paper where a particular ITK class is described.
Given that many of the algorithms may have been described already in published papers, the new classes may simply cite those papers.
The documentation of a new algorithm should also include guidance on how to use it. In particular, practical examples with realistic input data are the ideal way of presenting the algorithm usage to the community.
Gather feedback from the community
The Insight Journal uses an open public peer-review system. It is therefore possible for anybody in the community to contribute reviews of the articles posted in the Journal. This open channel allows users and developers to share information about the papers in the Journal. In particular, it makes it easy to send corrections and suggestions for improvement, which the authors can use to improve their work (source code and documents) and to submit subsequent versions of their contributions.
Hold a continuously open forum
Given that the reviews are non-anonymous and public, authors are free to have a two-way communication with the reviewers and constructively discuss the details of the proposed algorithms. This dialog is recorded in the form of reviews and replies to reviews and is also shared with the community. Readers of the papers can benefit from reading these dialogs, since they give insight into the issues that may be raised by the article's content.
The Procedure
Life Cycle of a Submission
The procedure for contributing new classes and algorithms to [ITK] is the following.
- An Author will propose an algorithm to the developers list or to the weekly tcon. This will be an initial check to make sure that the algorithm is not already available in ITK, and that it cannot be constructed from components already existing in the toolkit.
- The Author will prepare a working prototype of the source code and will test it with realistic data.
- The Author will submit a paper to the Insight Journal. The paper must include the following:
  - The source code of the prototype
  - The source code of the test
  - The realistic input data required by the test
  - The full list of parameters required by the test
  - The output data produced by the test
  - A document (preferably a hyperlinked PDF) describing the algorithm and how to use the new classes
- The source code of the prototype and its test will be automatically compiled and executed by the testing system of the Insight Journal.
- The paper and its source code will receive reviews from the community.
- Every week, at the tcon, the oversight committee will select papers from the Journal and assign developers to move the code into the CVS repository.
- The assignment will be added as a Feature Request in the bug tracker in order to ensure that it gets checked before the following release of the Toolkit.
- Once the code is in the ITK repository, further improvements to the code will be accompanied by short papers to the Insight Journal. The need for these papers will be limited to algorithmic improvements.
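The list of items that a paper must include lends itself to a simple pre-submission check. The following is a minimal sketch under stated assumptions: the key names in `REQUIRED_ITEMS`, the function `missing_items`, and the file names are all hypothetical and do not describe how the Insight Journal actually validates submissions.

```python
# Hypothetical keys, one per item the paper must include.
REQUIRED_ITEMS = (
    "prototype_source",
    "test_source",
    "input_data",
    "test_parameters",
    "expected_output",
    "document",
)


def missing_items(submission):
    """Return the required items that are absent or empty in `submission`."""
    return [item for item in REQUIRED_ITEMS if not submission.get(item)]


# Example submission; every file name here is invented for illustration.
submission = {
    "prototype_source": ["itkMyFilter.h", "itkMyFilter.txx"],
    "test_source": ["itkMyFilterTest.cxx"],
    "input_data": ["input.png"],
    "test_parameters": ["--sigma", "2.0"],
    "expected_output": [],  # the baseline output was forgotten
    "document": ["article.pdf"],
}

print(missing_items(submission))
```

An author could run a check of this kind before submitting, so that an incomplete package is caught before the automatic testing stage.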
It must be noted that papers to the Insight Journal are not the typical burdensome papers expected by traditional journals. Instead, these papers are a kind of technical report addressed to future developers, maintainers and users of the code. The goal of the papers is to provide enough technical information to make it possible to use the algorithms, and to maintain their code in the years to come. The papers will focus on the reproducibility of the test, and on instructing users on how to adapt the algorithm parameters to other data scenarios.
The ITK Editorial Committee
ITK Developers will play the role of editors for the Journal in particular topics. In this role they will make sure that a paper that falls into their subject of competence gets reviewed and moves through the process described above. The following is the list of current editors and their subjects.
Subject Matter | Editor | Affiliation |
---|---|---|
Registration | Daniel Blezek | GE |
Registration | Luis Ibanez | Kitware |
LevelSets | Jim Miller | GE |
Mathematical Morphology | Jim Miller | GE |
Meshes | Sylvain Jaume | Kitware |
Editors will also be in charge of specific areas of the toolkit, according to the subdirectory organization.
Current areas and their editors are listed in the following table.
Toolkit Area | Editor | Affiliation |
---|---|---|
Common | Luis Ibanez | Kitware |
BasicFilters | Daniel Blezek | GE |
Algorithms | Jim Miller | GE |
Statistics | Stephen Aylward | Kitware |
Spatial Objects | Julien Jomier | Kitware |
Wrapping | Brad King | Kitware |
DICOM | Mathieu Malaterre | Kitware |
The Automatic Testing System of the Insight Journal
The infrastructure of the Insight Journal will automatically test the source code of a submitted paper.
A full description of the testing environment is available at the following link:
http://www.insightsoftwareconsortium.org/wiki/index.php/IJ-Testing-Environment
How to Prepare a Submission to the Insight Journal
The full description of the process on how to prepare a submission to the Insight Journal can be found at
http://www.insightsoftwareconsortium.org/wiki/index.php/CMake_Tutorial
Templates for papers and CMakeLists files are available in this link.