User:Jadrian Miles/Streamline clustering: Difference between revisions
Revision as of 23:05, 26 March 2009
Tubegen generates an easy-to-parse .nocr file specifying points on streamlines.
- Pick a good dataset (Diffusion_MRI#Collaboration_Table) -- $G/data/diffusion/brown3t/cohen_hiv_study_registered.2007.02.07/patient120
- Run tubegen on it with modified parameters so that it doesn't cull anything. This will result in ~100k curves, with an average of ~70 points per curve.
- Write a Python script to divide the computation of the curve-to-curve distance matrix among many computers.
- Try the max and the mean of the minimum point-to-curve distances in the overlapping region as the inter-curve distance measure, or an exponentially weighted mean a la cad.
- See also cad's /map/gfx0/tools/linux/src/embed/utils/fast_distance_computing/src/ICurveDist/test
- The per-curve script should return the assigned matrix line as well as a list of curves sorted by distance and annotated with the distance, for fast clustering.
- After computing the upper half of the matrix, create an ordered list of curve-to-curve distances annotated with the curve pairs. Distributed quicksort? [1]
- Build up clusters until some termination condition is met: satisfactory number of non-singleton clusters, satisfactory median size of non-singleton clusters, etc. Or just run until you get one huge cluster, but store the binary cluster tree. It may be really skewed, but maybe a tree rebalancing algorithm could help in post-processing.
- Initialization: each curve is a singleton cluster.
- A curve's distance to a cluster is the minimum distance to any curve in that cluster.
- In each iteration, merge the curve (or cluster) with the lowest minimum curve-to-curve distance to its closest cluster into that cluster.
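
The clustering steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the script planned in these notes: the distance matrix is assumed to fit in memory as a nested list, only its upper half is read, and the names `sorted_pairs`, `single_linkage`, and `stop_distance` are hypothetical. Walking the sorted pair list and joining clusters with a union-find structure gives the same merges as the minimum-distance rule, and each merge is recorded so the binary cluster tree is kept.

```python
import itertools


def sorted_pairs(dist):
    """Return (distance, i, j) for every i < j, sorted by increasing distance.

    Only the upper half of the matrix is read.
    """
    n = len(dist)
    pairs = [(dist[i][j], i, j) for i, j in itertools.combinations(range(n), 2)]
    pairs.sort()
    return pairs


def single_linkage(dist, stop_distance=float("inf")):
    """Hypothetical sketch of minimum-distance (single-linkage) clustering.

    Returns (merges, labels). merges is the binary cluster tree as a list of
    (node_a, node_b, distance); leaves are curves 0..n-1 and the k-th merge
    (counting from 0) creates node n + k. labels[i] is the tree node that
    curve i belongs to when merging stops.
    """
    n = len(dist)
    parent = list(range(n))  # union-find forest over curves
    node = list(range(n))    # tree node id currently held by each root

    def find(x):
        # Follow parent links to the root, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    merges = []
    for d, i, j in sorted_pairs(dist):
        if d > stop_distance:  # assumed termination condition: a distance cap
            break
        ri, rj = find(i), find(j)
        if ri == rj:
            continue  # both curves are already in the same cluster
        merges.append((node[ri], node[rj], d))
        parent[rj] = ri
        node[ri] = n + len(merges) - 1
    labels = [node[find(i)] for i in range(n)]
    return merges, labels
```

The distance cap is just one possible termination condition; the other conditions listed above (number or median size of non-singleton clusters) could be checked inside the same loop.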
Code
Distance Measurements
- Core functions
- dcc.m computes a single, symmetric distance between two curves, using pdcc.m.
- pdcc.m computes a set of point-to-point distances between two curves, with a few customizable options.
- dpc.m computes the distance from a single point to a curve.
- Helpers
- followCurve.m gives the point at a specified fractional index along a curve.
- distOnCurve.m gives the distance along a curve between two fractional indices.
- Graphics
- drawpdcc.m plots two curves and the point-to-point matches found on them by pdcc.m.
- drawpdccset.m plots all four variations of the asymmetric distance for two curves.
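
To make the roles of the distance functions concrete, here is a minimal Python sketch of a point-to-curve distance and a symmetric curve-to-curve distance built from it. It is not a port of dpc.m, pdcc.m, or dcc.m, whose options are not described here. All function names are hypothetical, curves are assumed to be lists of coordinate tuples joined by straight segments, the restriction to the overlapping region is omitted, and symmetrizing by averaging the two directed values is an assumption.

```python
import math


def _euclid(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))


def point_segment_distance(p, a, b):
    """Distance from point p to the straight segment from a to b."""
    ab = [bi - ai for ai, bi in zip(a, b)]
    ap = [pi - ai for ai, pi in zip(a, p)]
    denom = sum(c * c for c in ab)
    if denom == 0:
        t = 0.0  # degenerate segment: a and b coincide
    else:
        # Clamp the projection so the closest point stays on the segment.
        t = max(0.0, min(1.0, sum(x * y for x, y in zip(ap, ab)) / denom))
    closest = [ai + t * c for ai, c in zip(a, ab)]
    return _euclid(p, closest)


def point_curve_distance(p, curve):
    """Minimum distance from p to any segment of a polyline curve."""
    if len(curve) == 1:
        return _euclid(p, curve[0])
    return min(point_segment_distance(p, a, b) for a, b in zip(curve, curve[1:]))


def directed_distances(c1, c2):
    """Minimum point-to-curve distance from each point of c1 to c2 (asymmetric)."""
    return [point_curve_distance(p, c2) for p in c1]


def curve_distance(c1, c2, reduce="mean"):
    """Symmetric distance: average of the two directed summaries.

    reduce="mean" takes the mean of the minimum distances, reduce="max" their
    maximum. The averaging step is an assumption of this sketch.
    """
    def summarize(ds):
        return max(ds) if reduce == "max" else sum(ds) / len(ds)

    forward = summarize(directed_distances(c1, c2))
    backward = summarize(directed_distances(c2, c1))
    return 0.5 * (forward + backward)
```

For two parallel straight curves one unit apart, every minimum point-to-curve distance is 1, so both the mean and the max variants return 1.0.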