| {{ Historical }}
| |
| Up-to-date information can be found on GitHub:
| https://github.com/InsightSoftwareConsortium/ITK/blob/master/Documentation/Data.md
| |
| This page documents how to add test data while developing ITK.
| See our [[ITK/Git|table of contents]] for more information.
| |
| __TOC__
| |
| = Setup =
| |
| The workflow below depends on local hooks to function properly.
| Follow the main [[ITK/Git/Develop#Setup|developer setup instructions]] before proceeding.
| In particular, run
| |
| [http://itk.org/gitweb?p=ITK.git;a=blob;f=Utilities/SetupForDevelopment.sh;hb=HEAD <code>SetupForDevelopment.sh</code>]:
| |
| | |
| $ ./Utilities/SetupForDevelopment.sh
| |
| | |
| ''The setup script was last modified for this workflow on '''May 9, 2011'''. Be sure to run it in a work tree with a checkout more recent than that.''
| |
| | |
| = Workflow =
| |
| | |
| Our workflow for adding data integrates with our standard Git [[ITK/Git/Develop|development process]].
| |
| Start by [[ITK/Git/Develop#Create_a_Topic|creating a topic]].
| |
| Return here when you reach the "edit files" step.
| |
| | |
| These instructions follow a typical use case of adding a new test with a baseline image.
| |
| | |
| == Add Data ==
| |
| | |
| {| style="width: 100%"
| |
| |-
| |
| |width=60%|
| |
| Copy the data file into your local source tree.
| |
| |-
| |
| |
| |
| :<code>$ mkdir -p Modules/.../test/Baseline</code>
| |
| :<code>$ cp ~/''MyTest.png'' Modules/.../test/Baseline/''MyTest.png''</code>
| |
| |
| |
| |}
| |
| | |
| == Add Test ==
| |
| | |
| {| style="width: 100%"
| |
| |-
| |
| |width=60%|
| |
| Edit the test CMakeLists.txt file and reference the data file in an <code>itk_add_test</code> call.
| |
| Specify the file inside <code>DATA{...}</code> using a path relative to the test directory:
| |
| :<code>$ edit Modules/.../test/CMakeLists.txt</code>
| |
| :{|
| |
| |
| |
| itk_add_test(NAME MyTest COMMAND ... --compare DATA{Baseline/''MyTest.png''} ...)
| |
| |}
| |
| Files in <code>Testing/Data</code> may be referenced as <code>DATA{${ITK_DATA_ROOT}/Input/''MyInput.png''}</code>.
| |
| If the data file references other data files, e.g. <code>.mhd -> .raw</code>, follow the link to the ExternalData module on the right and read the documentation on "associated" files.
| |
| |align="center"|
| |
| [http://itk.org/gitweb?p=ITK.git;a=blob;f=CMake/ExternalData.cmake;hb=HEAD <code>ExternalData.cmake</code>]
| |
| |}
| |
| | |
| == Run CMake ==
| |
| | |
| {| style="width: 100%"
| |
| |-
| |
| |width=60%|
| |
| ''CMake will [[#ExternalData|move the original file]]. Keep your own copy if necessary.''
| |
| | |
| Run cmake on the build tree:
| |
| :<code>$ cd ../ITK-build</code>
| :<code>$ cmake .</code>
| |
| :''(Or just run "make" to do a full configuration and build.)''
| |
| :<code>$ cd ../ITK</code>
| |
| |align="center"|
| |
| [[#Recover_Data_File|Need to recover the original file]]?
| |
| |-
| |
| |
| |
| During configuration CMake will display a message such as:
| |
| :{|
| |
| |
| |
| Linked Modules/.../test/Baseline/''MyTest.png''.md5 to ExternalData MD5/...
| |
| |}
| |
| This means that CMake converted the file into a data object referenced by a "content link".
| |
| |align="center"|
| |
| [http://itk.org/gitweb?p=ITK.git;a=blob;f=CMake/ExternalData.cmake;hb=HEAD <code>ExternalData.cmake</code>]
| |
| |}
| |
| | |
| == Commit ==
| |
| | |
| {| style="width: 100%"
| |
| |-
| |
| |width=60%|
| |
| Continue to [[ITK/Git/Develop#Create_a_Topic|create the topic]] and edit other files as necessary.
| |
| Add the content link and commit it along with the other changes:
| |
| :<code>$ git add Modules/.../test/Baseline/''MyTest.png''.md5</code>
| |
| :<code>$ git add Modules/.../test/CMakeLists.txt</code>
| |
| :<code>$ git commit</code>
| |
| |align="center"|
| |
| [http://www.kernel.org/pub/software/scm/git/docs/git-add.html <code>git help add</code>]
| |
| <br/>
| |
| [http://www.kernel.org/pub/software/scm/git/docs/git-commit.html <code>git help commit</code>]
| |
| |-
| |
| |
| |
| The local <code>pre-commit</code> hook will display a message such as:
| |
| :{|
| |
| |
| |
| Modules/.../test/Baseline/''MyTest.png''.md5: Added content to Git at refs/data/MD5/...
| |
| Modules/.../test/Baseline/''MyTest.png''.md5: Added content to local store at .ExternalData/MD5/...
| |
| Content link Modules/.../test/Baseline/''MyTest.png''.md5 -> .ExternalData/MD5/...
| |
| |}
| |
| This means that the pre-commit hook recognized that the content link references a new data object and [[#pre-commit|prepared it for upload]].
| |
| |align="center"|
| |
| [http://itk.org/gitweb?p=ITK.git;a=blob;f=Utilities/Hooks/pre-commit;hb=HEAD <code>pre-commit</code>]
| |
| |}
| |
| | |
| == Push ==
| |
| {| style="width: 100%"
| |
| |-
| |
| |width=60%|
| |
| Follow the instructions to [[ITK/Git/Develop#Share_a_Topic|share the topic]].
| |
| When you push it to Gerrit for review using
| |
| :<code>$ git gerrit-push</code>
| |
| |align="center"|
| |
| [http://itk.org/gitweb?p=ITK.git;a=blob;f=Utilities/Git/git-gerrit-push;hb=HEAD <code>git-gerrit-push</code>]
| |
| |-
| |
| |
| |
| part of the output will be of the form
| |
| :{|
| |
| |
| |
| * ...:refs/data/commits/... [new branch]
| |
| * HEAD:refs/for/master/''my-topic'' [new branch]
| |
| Pushed refs/data and removed local copy:
| |
| MD5/...
| |
| |}
| |
| This means that the git-gerrit-push script pushed the topic and [[#git-gerrit-push|uploaded the data]] it references.
| |
| |}
| |
| | |
| = Building =
| |
| | |
| == Download ==
| |
| {| style="width: 100%"
| |
| |-
| |
| |width=60%|
| |
| For the test data to be downloaded and made available to the tests in your build tree, the <code>ITKData</code> target must be built.
| |
| One may build the target directly, e.g. <code>make ITKData</code>, to obtain the data without a complete build.
| |
| The output will be something like
| |
| :{|
| |
| |
| |
| -- Fetching ".../ExternalData/MD5/..."
| |
| -- [download 100% complete]
| |
| -- Downloaded object: "''ITK-build''/ExternalData/Objects/MD5/..."
| |
| |}
| |
| |-
| |
| |
| |
| The downloaded files appear in <code>''ITK-build''/ExternalData</code> by default.
| |
| |
| |
| |}
| |
| | |
| == Local Store ==
| |
| {| style="width: 100%"
| |
| |-
| |
| |width=60%|
| |
| It is possible to configure one or more local ExternalData object stores shared among multiple builds.
| |
| For each build, set the advanced cache entry <code>ExternalData_OBJECT_STORES</code> to a directory on your local disk outside all build trees, e.g. "<code>/home/user/.ExternalData</code>":
| |
| :<code>$ cmake -DExternalData_OBJECT_STORES=/home/user/.ExternalData ../ITK</code>
| |
| The ExternalData module will store downloaded objects in the local store instead of the build tree.
| |
| Once an object has been downloaded by one build it will persist in the local store for re-use by other builds without downloading again.
| |
| |align="center"|
| |
| [http://itk.org/gitweb?p=ITK.git;a=blob;f=CMake/ExternalData.cmake;hb=HEAD <code>ExternalData.cmake</code>]
| |
| |}
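As a rough illustration of how a shared store avoids repeated downloads, the sketch below looks up the object named by a content link in a list of store directories. The helper name <code>find_object</code> is hypothetical, and the <code>MD5/${hash}</code> layout inside each store is an assumption based on the paths shown in the messages above. The real lookup is performed by the ExternalData module.

```shell
# Hypothetical helper (not part of ITK): find_object LINK STORE...
# Prints the path of the first stored object matching the content link.
find_object() {
    link=$1
    shift
    # The content link holds only the MD5 hash of the data.
    hash=$(cat "$link")
    for store in "$@"; do
        # Assumed layout: one file per object, named by its hash.
        if [ -f "$store/MD5/$hash" ]; then
            printf '%s\n' "$store/MD5/$hash"
            return 0
        fi
    done
    # Not in any local store: a download would be needed.
    return 1
}
```

For example, <code>find_object MyTest.png.md5 /home/user/.ExternalData</code> would print the stored path if an earlier build already fetched the object.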
| |
| | |
| = Discussion =
| |
| | |
| An ITK test data file is not stored in the main source tree under version control.
| |
| Instead the source tree contains a "content link" that refers to a data object by a hash of its content.
| |
| At build time the
| |
| [http://itk.org/gitweb?p=ITK.git;a=blob;f=CMake/ExternalData.cmake;hb=HEAD <code>ExternalData.cmake</code>]
| |
| module fetches data needed by enabled tests.
| |
| This allows arbitrarily large data to be added and removed without bloating the version control history.
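Because a content link is just the hash of the data, anyone holding a candidate file can check it against the link. A minimal sketch, assuming <code>md5sum</code> is available; the helper name <code>verify_content_link</code> is hypothetical and not part of ITK:

```shell
# Hypothetical helper: verify_content_link LINK FILE
# Succeeds only if FILE has the MD5 hash recorded in LINK.
verify_content_link() {
    expected=$(cat "$1")
    # Hash the candidate file; md5sum prints "<hash>  -" for stdin.
    actual=$(md5sum < "$2" | cut -d ' ' -f 1)
    [ "$expected" = "$actual" ]
}
```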
| |
| | |
| The above [[#Workflow|workflow]] allows developers to add a new data file almost as if committing it to the source tree.
| |
| The following subsections discuss details of the workflow implementation.
| |
| | |
| == ExternalData ==
| |
| | |
| While [[#Run_CMake|CMake runs]], the
| |
| [http://itk.org/gitweb?p=ITK.git;a=blob;f=CMake/ExternalData.cmake;hb=HEAD ExternalData]
| |
| module evaluates [[#Add_Test|DATA{} references]].
| |
| ITK [http://itk.org/gitweb?p=ITK.git;a=blob;f=CMake/ITKExternalData.cmake;hb=HEAD sets]
| |
| the <code>ExternalData_LINK_CONTENT</code> option to <code>MD5</code> to enable automatic conversion of raw data files into content links.
| |
| When the module detects a real data file in the source tree it performs the following transformation as specified in the module documentation:
| |
| * Compute the MD5 hash of the file
| |
| * Store the <code>${hash}</code> in a file with the original name plus <code>.md5</code>
| |
| * Rename the original file to <code>.ExternalData_MD5_${hash}</code>
| |
| The real data now sit in a file that we [http://itk.org/gitweb?p=ITK.git;a=blob;f=.gitignore;hb=HEAD tell Git to ignore].
| |
| For example:
| |
| | |
| $ '''cat Modules/.../test/Baseline/.ExternalData_MD5_477e602800c18624d9bc7a32fa706b97 |md5sum'''
| |
| 477e602800c18624d9bc7a32fa706b97 -
| |
| $ '''cat Modules/.../test/Baseline/''MyTest.png''.md5'''
| |
| 477e602800c18624d9bc7a32fa706b97
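The three steps listed above can be imitated by hand, which may help when reasoning about what CMake leaves behind. This is only a sketch: <code>make_content_link</code> is a hypothetical helper, and the actual conversion is performed by the ExternalData module during configuration.

```shell
# Hypothetical helper (illustration only): make_content_link FILE
make_content_link() {
    file=$1
    dir=$(dirname "$file")
    # Step 1: compute the MD5 hash of the file content.
    hash=$(md5sum < "$file" | cut -d ' ' -f 1)
    # Step 2: store the hash in a file named after the original plus .md5.
    printf '%s\n' "$hash" > "$file.md5"
    # Step 3: rename the original to the name that Git is told to ignore.
    mv "$file" "$dir/.ExternalData_MD5_$hash"
}
```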
| |
| | |
| === Recover Data File ===
| |
| | |
| To recover the original file after running CMake but before committing, undo the operation:
| |
| | |
| $ '''cd Modules/.../test/Baseline'''
| |
| $ '''mv .ExternalData_MD5_$(cat MyTest.png.md5) MyTest.png'''
| |
| | |
| == pre-commit ==
| |
| | |
| While [[#Commit|committing]] a new or modified content link, the
| |
| [http://itk.org/gitweb?p=ITK.git;a=blob;f=Utilities/Hooks/pre-commit;hb=HEAD <code>pre-commit</code>]
| |
| hook moves the real data object from the <code>.ExternalData_MD5_${hash}</code> file left by the ExternalData module
| |
| to a local object repository stored in a <code>.ExternalData</code> directory at the top of the source tree.
| |
| | |
| The hook also uses Git plumbing commands to store the data object as a blob in the local Git repository.
| |
| The blob is not referenced by the new commit but instead by <code>refs/data/MD5/${hash}</code>.
| |
| This keeps the blob alive in the local repository but does not add it to the project history.
| |
| For example:
| |
| $ '''git for-each-ref --format="%(refname)" refs/data'''
| |
| refs/data/MD5/477e602800c18624d9bc7a32fa706b97
| |
| $ '''git cat-file blob refs/data/MD5/477e602800c18624d9bc7a32fa706b97 | md5sum'''
| |
| 477e602800c18624d9bc7a32fa706b97 -
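The effect described above can be reproduced with two standard plumbing commands, <code>git hash-object</code> and <code>git update-ref</code>. The sketch below is not the hook's actual code; <code>store_data_ref</code> is a hypothetical helper that shows one way to keep a blob alive under <code>refs/data/MD5/</code> without touching any branch.

```shell
# Hypothetical helper (not the real hook): store_data_ref FILE
# Must be run inside a Git work tree.
store_data_ref() {
    file=$1
    hash=$(md5sum < "$file" | cut -d ' ' -f 1)
    # Write the file into the object database; prints the blob id.
    blob=$(git hash-object -w "$file") || return 1
    # Point a ref at the blob so it stays reachable, without adding
    # it to the history of any branch.
    git update-ref "refs/data/MD5/$hash" "$blob"
}
```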
| |
| | |
| == git gerrit-push ==
| |
| | |
| The "<code>git gerrit-push</code>" command is actually an
| |
| [http://itk.org/gitweb?p=ITK.git;a=blob;f=Utilities/DevelopmentSetupScripts/SetupGitAliases.sh;hb=HEAD alias]
| |
| for the
| |
| [http://itk.org/gitweb?p=ITK.git;a=blob;f=Utilities/Git/git-gerrit-push;hb=HEAD <code>Utilities/Git/git-gerrit-push</code>]
| |
| script.
| |
| In addition to pushing the topic branch to Gerrit, the script detects content links added or modified by the commits in the topic.
| |
| It reads the data object hashes from the content links and looks for matching <code>refs/data/</code> entries in the local Git repository.
| |
| | |
| The script pushes the matching data objects to Gerrit inside a temporary commit object disjoint from the rest of history.
| |
| For example:
| |
| | |
| $ '''git gerrit-push --dry-run --no-topic'''
| |
| * f59717cfb68a7093010d18b84e8a9a90b6b42c11:refs/data/commits/f59717cfb68a7093010d18b84e8a9a90b6b42c11 [new branch]
| |
| Pushed refs/data and removed local copy:
| |
| MD5/477e602800c18624d9bc7a32fa706b97
| |
| $ '''git ls-tree -r --name-only f59717cf'''
| |
| MD5/477e602800c18624d9bc7a32fa706b97
| |
| $ '''git log --oneline f59717cf'''
| |
| f59717c data
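A commit that is disjoint from the rest of history can be built with Git plumbing alone. The following is a sketch under assumptions, not the contents of <code>git-gerrit-push</code>: the helper name <code>data_commit</code> is hypothetical, and it assumes the blobs are already referenced by <code>refs/data/MD5/${hash}</code> as described in the previous section.

```shell
# Hypothetical helper: data_commit HASH...
# Prints the id of a parentless commit whose tree holds MD5/<hash> entries.
data_commit() {
    tmpdir=$(mktemp -d) || return 1
    index=$tmpdir/index   # scratch index, so the real index is untouched
    for hash in "$@"; do
        # Resolve the blob kept alive under refs/data.
        blob=$(git rev-parse --verify -q "refs/data/MD5/$hash") || return 1
        GIT_INDEX_FILE=$index git update-index --add \
            --cacheinfo 100644 "$blob" "MD5/$hash" || return 1
    done
    tree=$(GIT_INDEX_FILE=$index git write-tree) || return 1
    rm -rf "$tmpdir"
    # No -p option is given, so the commit has no parents and
    # shares no history with the project branches.
    printf 'data\n' | git commit-tree "$tree"
}
```

The printed commit id could then be inspected with <code>git ls-tree -r --name-only</code>, as in the example above.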
| |
| | |
| A robot runs every few minutes to fetch the objects from Gerrit and upload them to a
| |
| [http://www.itk.org/files/ExternalData location] that we
| |
| [http://itk.org/gitweb?p=ITK.git;a=blob;f=CMake/ITKExternalData.cmake;hb=HEAD tell ExternalData to search]
| |
| at build time.
| |