MRI Repository/Design Decisions: Difference between revisions

From VrlWiki
This page is being used to work through open issues in the MRI Repository

* Compression

* File Using (losslessly) compressed NIfTI files, a single scan of an individual totals a little under 90MB. If we were instead to post DICOMs, I think we would end up closer to 350MB per scan but that could also be compressed at a pretty great ratio. Offering both formats would be ideal; with compressed DICOMs, I think the total size per scan would be about 275MB, plus a little extra for metadata. Thus far we have on the order of 200 patients scanned from various studies, and so the database should be able to accommodate something around that scale: 50GB baseline, with the possibility to expand by an order of magnitude.

It would be nice to offer coregistered and even atlas-registered data, following Win's pipeline, but there are tradeoffs to that. Whatever happens, the untouched fresh-off-the-scanner data must be made available. Any choices we make about further derived forms of the data should be applied to all datasets --- that is, if we offer coregistered data for one patient, we must offer coregistered data for all patients. Any registered data should be accompanied by its registration matrices.

The database should be searchable and sliceable --- that is, a user should be able to get all scans for a certain patient, or a certain age range, or a certain disease state. We will likely be posting other types of scans besides diffusion MRI (for example, T1-weighted MRI), so the type of scan would be another search index.

We would be hard-pressed to design a static website that meets these

Revision as of 14:58, 8 July 2010

This page is being used to work through open issues in the MRI Repository

   Legend:  ? An open issue or an option.
            + A benefit
            - A detriment
  • Compression
      ? Compress on the disk
          - Overhead to instrument all software with decompressors.
          ? Compressed files may be pre-staged for downloading.
      ? Compression on download
      ? Compression mechanism
          ? Support multiple compressors
          ? Support one compressor
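
The difference between the two compression options can be sketched in a few lines of Python. This is only an illustration: it assumes gzip as the single supported compressor (the mechanism is still an open issue above), and the function names are hypothetical. `compress_on_disk` writes a pre-staged compressed copy, `compress_for_download` leaves the stored file raw and compresses at request time, and `read_scan` shows the kind of decompression step every tool would need if files are stored compressed.

```python
import gzip
import shutil
from pathlib import Path


def compress_on_disk(src: Path) -> Path:
    """Write a gzip copy next to src (e.g. scan.nii -> scan.nii.gz)."""
    dst = src.with_name(src.name + ".gz")
    with src.open("rb") as fin, gzip.open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return dst


def compress_for_download(src: Path) -> bytes:
    """Compress in memory at request time; the file on disk stays raw."""
    return gzip.compress(src.read_bytes())


def read_scan(path: Path) -> bytes:
    """Read a scan's bytes whether or not the file is gzip-compressed."""
    opener = gzip.open if path.suffix == ".gz" else open
    with opener(path, "rb") as f:
        return f.read()
```

Supporting only one compressor keeps `read_scan` this small; each additional compressor would add another branch to every reader.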


  • The database should be able to accommodate data on the order of 200 patients: 50GB baseline, with the possibility to expand by an order of magnitude.
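
As a rough check on the 50GB figure, the arithmetic can be written out using the approximate per-scan sizes quoted in the earlier revision text of this page. The assumptions here are mine: one scan per patient, 1 GB = 1000 MB, and the small metadata overhead ignored.

```python
# Approximate figures quoted in the earlier revision of this page.
NIFTI_MB_PER_SCAN = 90          # losslessly compressed NIfTI ("a little under 90MB")
BOTH_FORMATS_MB_PER_SCAN = 275  # compressed NIfTI plus compressed DICOM
PATIENTS = 200                  # "on the order of 200 patients"
SCANS_PER_PATIENT = 1           # assumption: one scan each
GROWTH_FACTOR = 10              # "an order of magnitude"


def total_gb(mb_per_scan: float, scans: int) -> float:
    """Total size in GB, taking 1 GB = 1000 MB."""
    return mb_per_scan * scans / 1000


scans = PATIENTS * SCANS_PER_PATIENT
nifti_only = total_gb(NIFTI_MB_PER_SCAN, scans)
both_formats = total_gb(BOTH_FORMATS_MB_PER_SCAN, scans)
expanded = both_formats * GROWTH_FACTOR

print(f"NIfTI only:    {nifti_only:.0f} GB")    # 18 GB
print(f"Both formats:  {both_formats:.0f} GB")  # 55 GB
print(f"After growth:  {expanded:.0f} GB")      # 550 GB
```

Offering both formats lands at about 55 GB, which is where the 50GB baseline comes from; an order-of-magnitude expansion would put the requirement at roughly half a terabyte.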


  • Untouched, fresh-off-the-scanner data must be made available.
  • Coregistered and even atlas-registered data, following Win's pipeline, would be nice to offer.
      ? Any choices we make about further derived forms of the data should be applied to all datasets --- that is, if we offer coregistered data for one patient, we must offer coregistered data for all patients.
      ? Registered data should be accompanied by its registration matrices.


  • Searchable
  • Sliceable
      A user should be able to get all scans for a certain patient, or a certain age range, or a certain disease state. We will likely be posting other types of scans besides diffusion MRI (for example, T1-weighted MRI), so the type of scan would be another search index.
  • Web design
      We would be hard-pressed to design a static website that allows searching and slicing.
      There is still a ban on dynamic web content on the main public CS servers.
          ? We could set up a public-facing virtual web server, like vrl.cs.brown.edu, and perhaps negotiate an exception.
          ? Design a complex set of static pages that would be automatically updated every time new data are added.
      Use an appropriate web language that has web content generation libraries.
          ? PHP
          ? Python
          ? Perl
          ? Java
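
To make the static-pages option concrete, here is a minimal sketch of a generator that would be rerun whenever new data are added. Everything in it is hypothetical: the record fields, the sample IDs, the output layout, and the function names are placeholders for illustration, not a proposed design. It writes one page per value of each search index.

```python
from collections import defaultdict
from html import escape
from pathlib import Path

# Hypothetical scan records; field names and values are illustrative only.
SCANS = [
    {"id": "P001_20100708_DTI_1", "patient": "P001", "scan_type": "DTI"},
    {"id": "P001_20100708_T1_1", "patient": "P001", "scan_type": "T1"},
    {"id": "P002_20100709_DTI_1", "patient": "P002", "scan_type": "DTI"},
]


def render_page(title, scan_ids):
    """Return a bare-bones HTML page listing the given scan IDs."""
    items = "\n".join(f"  <li>{escape(s)}</li>" for s in scan_ids)
    return (
        f"<html><body><h1>{escape(title)}</h1>\n"
        f"<ul>\n{items}\n</ul></body></html>\n"
    )


def regenerate(scans, out_dir, index_fields=("patient", "scan_type")):
    """Rewrite one static page per value of each index; return the paths."""
    out_dir = Path(out_dir)
    written = []
    for field in index_fields:
        groups = defaultdict(list)
        for scan in scans:
            groups[scan[field]].append(scan["id"])
        for value, ids in sorted(groups.items()):
            page = out_dir / field / f"{value}.html"
            page.parent.mkdir(parents=True, exist_ok=True)
            page.write_text(render_page(f"{field} = {value}", sorted(ids)))
            written.append(page)
    return written
```

This handles discrete indices such as patient or scan type. A continuous index such as age range would have to be bucketed in advance, and combined searches would multiply the number of pages, which is part of why a purely static site is hard to make searchable and sliceable.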
  • Data management
      I foresee some sort of unique ID for each scan (it need not be an obfuscated one!).
          ? A standard format like <patient ID>_<scan date>_<scan type>_<scan repetition>
          ? A flat pool of files named according to the ID, and an SQL database that matches each ID to its values for the various search indices.
      The website would ideally be a pleasantly designed interface to this database.
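
The ID format and the ID-to-index database could look something like the following sketch, which uses Python's built-in `sqlite3` module with an in-memory database. The table layout, column names, sample rows, and function names are all assumptions made for illustration; only the `<patient ID>_<scan date>_<scan type>_<scan repetition>` pattern and the search indices (patient, age, disease state, scan type) come from the notes above.

```python
import sqlite3


def make_scan_id(patient_id, scan_date, scan_type, repetition):
    """Build <patient ID>_<scan date>_<scan type>_<scan repetition>."""
    return f"{patient_id}_{scan_date}_{scan_type}_{repetition}"


# Hypothetical sample rows: (scan_id, patient_id, scan_type, age, disease_state)
SAMPLE_ROWS = [
    (make_scan_id("P001", "20100708", "DTI", 1), "P001", "DTI", 34, "control"),
    (make_scan_id("P001", "20100708", "T1", 1), "P001", "T1", 34, "control"),
    (make_scan_id("P002", "20100709", "DTI", 1), "P002", "DTI", 61, "patient"),
]


def build_index(rows):
    """Create an in-memory table mapping each scan ID to its search values."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE scans ("
        " scan_id TEXT PRIMARY KEY,"
        " patient_id TEXT, scan_type TEXT, age INTEGER, disease_state TEXT)"
    )
    db.executemany("INSERT INTO scans VALUES (?, ?, ?, ?, ?)", rows)
    return db


def find_scans(db, patient_id=None, scan_type=None,
               min_age=None, max_age=None, disease_state=None):
    """Return the scan IDs matching every filter that is not None."""
    filters = [
        ("patient_id = ?", patient_id),
        ("scan_type = ?", scan_type),
        ("age >= ?", min_age),
        ("age <= ?", max_age),
        ("disease_state = ?", disease_state),
    ]
    clauses = [sql for sql, value in filters if value is not None]
    params = [value for _, value in filters if value is not None]
    where = (" WHERE " + " AND ".join(clauses)) if clauses else ""
    query = "SELECT scan_id FROM scans" + where + " ORDER BY scan_id"
    return [row[0] for row in db.execute(query, params)]


db = build_index(SAMPLE_ROWS)
print(find_scans(db, scan_type="DTI", min_age=50))  # ['P002_20100709_DTI_1']
```

Each returned ID doubles as the file name in the flat pool, so the website only needs to turn a query result into a list of download links.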