MRI Repository/Design Decisions

From VrlWiki

This page is being used to work through open issues in the MRI Repository.

   Legend:  ? An open issue or an option.
            + A benefit
            - A detriment
  • Compression
? Compress on the disk
- Overhead to instrument all software with decompressors.
? Compressed files may be pre-staged for downloading.
? Compression on download
? Compression mechanism
? Support multiple compressors
? Support one compressor
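
As a rough sketch of the "pre-staged for downloading" option: the files on disk stay uncompressed, so no existing software needs a decompressor, and a compressed copy is written to a staging area. The choice of gzip, the helper name prestage_compressed, and the staging directory are all assumptions for illustration; the compression mechanism is still an open issue above.

```python
import gzip
import shutil
from pathlib import Path


def prestage_compressed(src: Path, staging_dir: Path) -> Path:
    """Write a gzip copy of src into staging_dir and return its path.

    The original file is left untouched; only the download copy is
    compressed. gzip is an assumed stand-in for whichever compressor
    is eventually chosen.
    """
    staging_dir.mkdir(parents=True, exist_ok=True)
    dst = staging_dir / (src.name + ".gz")
    # Stream the file so large scans are never held fully in memory.
    with open(src, "rb") as fin, gzip.open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return dst
```

Supporting several compressors would mean one staged copy per format, which multiplies the disk space used by the staging area.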


  • The database should be able to accommodate a baseline of roughly 50GB, with the possibility to expand by an order of magnitude.


  • Fresh-off-the-scanner data must be made available.
  • Coregistered and even atlas-registered data should also be made available, following Win's pipeline.
? Any choices we make about further derived forms of the data should be applied to all datasets. That is, if we offer coregistered data for one patient, we must offer coregistered data for all patients.
? Registered data should be accompanied by its registration matrices.


  • Searchable
  • Sliceable
A user should be able to get all scans for a certain patient, or a certain age range, or a certain disease state. We will likely be posting other types of scans besides diffusion MRI (for example, T1-weighted MRI), so the type of scan would be another search index.
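
A minimal sketch of what "slicing" could mean in practice, assuming each scan carries a patient ID, an age, a disease state, and a scan type. The record layout, the field names, and the function select_scans are hypothetical, not an agreed design.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class ScanRecord:
    # Hypothetical search indices; the real set is still to be decided.
    patient_id: str
    age_years: int
    disease_state: str
    scan_type: str  # e.g. "diffusion" or "T1"


def select_scans(
    records: Iterable[ScanRecord],
    patient_id: Optional[str] = None,
    age_range: Optional[Tuple[int, int]] = None,  # inclusive (low, high)
    disease_state: Optional[str] = None,
    scan_type: Optional[str] = None,
) -> List[ScanRecord]:
    """Return the records matching every criterion that is not None."""
    selected = []
    for rec in records:
        if patient_id is not None and rec.patient_id != patient_id:
            continue
        if age_range is not None and not (
            age_range[0] <= rec.age_years <= age_range[1]
        ):
            continue
        if disease_state is not None and rec.disease_state != disease_state:
            continue
        if scan_type is not None and rec.scan_type != scan_type:
            continue
        selected.append(rec)
    return selected
```

Criteria combine with a logical AND, so a call such as select_scans(records, scan_type="T1", age_range=(20, 40)) narrows by both at once.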
  • Web design
We would be hard-pressed to design a static website that allows searching and slicing.
There is still a ban on dynamic web content on the main public CS servers.
? We could set up a public-facing virtual web server, like vrl.cs.brown.edu, and perhaps negotiate an exception to the ban.
? Design a complex set of static pages that would be automatically updated every time new data are added.
Use appropriate web languages that have web content generation libraries.
? PHP
? Python
? Perl
- Perl is a hack
? Java
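
The static-pages option could look roughly like the sketch below, which regenerates one HTML page per scan type and would be rerun each time new data are added. Python is used only because it is one of the candidates listed above. The function name build_static_pages, the (scan ID, scan type) input pairs, and the page naming are assumptions, and scan types are assumed to be safe to use as file names.

```python
import html
from collections import defaultdict
from pathlib import Path
from typing import Dict, Iterable, List, Tuple


def build_static_pages(
    scans: Iterable[Tuple[str, str]], out_dir: Path
) -> List[Path]:
    """Regenerate one static HTML page per scan type.

    scans yields (scan_id, scan_type) pairs; this layout is hypothetical.
    Returns the paths of the pages written, ordered by scan type.
    """
    by_type: Dict[str, List[str]] = defaultdict(list)
    for scan_id, scan_type in scans:
        by_type[scan_type].append(scan_id)

    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for scan_type, ids in sorted(by_type.items()):
        items = "\n".join(f"<li>{html.escape(i)}</li>" for i in sorted(ids))
        page = (
            f"<html><body><h1>{html.escape(scan_type)} scans</h1>\n"
            f"<ul>\n{items}\n</ul></body></html>\n"
        )
        path = out_dir / f"{scan_type}.html"
        path.write_text(page, encoding="utf-8")
        written.append(path)
    return written
```

Each additional search index multiplies the number of pages to pregenerate, which is why this option is described above as a complex set of static pages.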
  • Data management
MRIs are currently in: /data/graphics/mri/*/
? Other MRIs
I foresee some sort of unique ID for each scan (it need not be an obfuscated one!).
? A standard format like <patient ID>_<scan date>_<scan type>_<scan repetition>
? A flat pool of files named according to the ID, and an SQL database that matches each ID to its values for the various search indices.
The website would ideally be a pleasantly designed interface to this database.
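
The ID format and the ID-to-index database could be sketched as follows. SQLite is used here only as a convenient stand-in for "an SQL database". The table name scans, the column names, and the helper functions are hypothetical, and the sketch assumes that none of the four ID fields contains an underscore.

```python
import sqlite3
from typing import Dict, List


def parse_scan_id(scan_id: str) -> Dict[str, str]:
    """Split '<patient ID>_<scan date>_<scan type>_<scan repetition>'.

    Assumes the fields themselves contain no underscores.
    """
    parts = scan_id.split("_")
    if len(parts) != 4:
        raise ValueError(f"expected 4 underscore-separated fields: {scan_id!r}")
    keys = ("patient_id", "scan_date", "scan_type", "repetition")
    return dict(zip(keys, parts))


def index_scan(conn: sqlite3.Connection, scan_id: str) -> None:
    """Record one scan ID together with its search-index values."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS scans ("
        "scan_id TEXT PRIMARY KEY, patient_id TEXT, scan_date TEXT, "
        "scan_type TEXT, repetition TEXT)"
    )
    f = parse_scan_id(scan_id)
    conn.execute(
        "INSERT INTO scans VALUES (?, ?, ?, ?, ?)",
        (scan_id, f["patient_id"], f["scan_date"], f["scan_type"], f["repetition"]),
    )


def find_by_type(conn: sqlite3.Connection, scan_type: str) -> List[str]:
    """Return the IDs of all scans of the given type, sorted by ID."""
    rows = conn.execute(
        "SELECT scan_id FROM scans WHERE scan_type = ? ORDER BY scan_id",
        (scan_type,),
    )
    return [row[0] for row in rows]
```

Because the ID doubles as the file name in the flat pool, a query result maps directly to the files to serve. Indices that are not encoded in the ID, such as age or disease state, would need extra columns filled from another source.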