Qi Wei and Dinesh K. Pai
University of British Columbia and Rutgers University
Model Description
Graphical interpretation of the statistical ultrasound image restoration model in [Husby et al. 2001].
Fixed hyper-parameters are represented by boxes, the observed variable by a double circle, and unknown variables by circles. Model parameters that are not hyper-parameters are represented by double boxes. h is the point spread function; it is estimated on the CPU for each image using the modified homomorphic transformation method in [Tax 1995]. The reflectance and variance variables are estimated alternately on the GPU.
The parallel fragment processors on GPUs are better suited to image processing tasks than conventional CPUs, which can be limited by memory bandwidth and caching. The graphs below illustrate high-level image processing on the two architectures.
A CPU implements an iterative image processing algorithm sequentially. One pixel is computed inside the loop.
A GPU can process multiple fragments/pixels in parallel in one pass due to its parallel architecture.
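The contrast between the two captions can be sketched in plain Python. This is only an illustration, not the authors' implementation: the function names (`smooth`, `process_sequential`, `process_one_pass`) and the five-point averaging kernel are assumptions chosen for the example. The point is that the per-pixel function reads only the input image, so every output pixel is independent and could be computed in any order, or all at once as a fragment processor would.

```python
def smooth(image, x, y):
    """Hypothetical per-pixel kernel: average of a pixel and its 4 neighbours.

    It reads only the input image and returns one output value,
    which is the role a fragment program plays on the GPU.
    """
    height, width = len(image), len(image[0])
    total = 0.0
    for dy, dx in ((0, 0), (-1, 0), (1, 0), (0, -1), (0, 1)):
        yy = min(max(y + dy, 0), height - 1)  # clamp at the image border
        xx = min(max(x + dx, 0), width - 1)
        total += image[yy][xx]
    return total / 5.0


def process_sequential(image, kernel):
    """CPU style: one pixel is computed per loop iteration."""
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            out[y][x] = kernel(image, x, y)
    return out


def process_one_pass(image, kernel):
    """GPU style: the same kernel is mapped over every pixel coordinate.

    No call depends on the result of another, so the calls could run
    concurrently; here they are mapped one after another for simplicity.
    """
    height, width = len(image), len(image[0])
    coords = [(x, y) for y in range(height) for x in range(width)]
    flat = [kernel(image, x, y) for (x, y) in coords]
    return [flat[row * width:(row + 1) * width] for row in range(height)]
```

Both functions return the same image; only the execution model differs.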
Results
Using the four-color packing idea, convolution is performed efficiently by the "dot" operation in the fragment processor with four point spread functions. This allows convolution with large kernels to be computed efficiently. The following graph compares the computation time of convolution on a 128×128 image for different implementations on different graphics cards.

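One plausible reading of the packing idea is that kernel taps and image samples are grouped into four-component vectors (the RGBA channels of a texel), so that a single four-wide dot product handles four multiply-adds at once. The sketch below illustrates that reading in one dimension on the CPU. It is an assumption for illustration, not the authors' shader code, and the names `dot4`, `pack4`, `correlate_naive` and `correlate_packed` are hypothetical. The kernel is applied as a sliding weighted sum, which equals convolution with a pre-flipped kernel.

```python
def dot4(a, b):
    """Four-component dot product, the analogue of a shader 'dot' on RGBA."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3]


def pack4(values):
    """Zero-pad to a multiple of four and group into 4-tuples."""
    padded = list(values) + [0.0] * (-len(values) % 4)
    return [tuple(padded[i:i + 4]) for i in range(0, len(padded), 4)]


def correlate_naive(signal, kernel):
    """Reference: one multiply-add per kernel tap ('valid' region only)."""
    n = len(signal) - len(kernel) + 1
    return [sum(signal[i + k] * kernel[k] for k in range(len(kernel)))
            for i in range(n)]


def correlate_packed(signal, kernel):
    """Same result, but four taps are consumed per dot4 call."""
    packed_kernel = pack4(kernel)
    n = len(signal) - len(kernel) + 1
    # Three trailing zeros let the last 4-wide window read past the end;
    # the matching kernel taps are zero padding, so they add nothing.
    padded_signal = list(signal) + [0.0] * 3
    out = []
    for i in range(n):
        acc = 0.0
        for group, taps in enumerate(packed_kernel):
            start = i + 4 * group
            window = tuple(padded_signal[start:start + 4])
            acc += dot4(window, taps)
        out.append(acc)
    return out
```

With a kernel of K taps, the packed version issues about K/4 dot products per output sample instead of K scalar multiply-adds, which is why the benefit grows with kernel size.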
Digital picture of the phantom from which the ultrasound images are acquired.
Ultrasound image after interpolation, shown on the screen of an ultrasound machine.
Raw-data ultrasound image, before interpolation. A region of size 128×128 is selected.
Below are some resulting images showing the variance field after different numbers of iterations, obtained by processing the region of interest marked in the right image above. Reflectance and variance are updated alternately for the first 20 iterations; after that, only the variance continues to be updated, using the reflectance obtained at the 20th iteration.
After 500 iterations
After 1500 iterations
After 3000 iterations
After 5000 iterations
After 7000 iterations
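The update schedule used for these images can be sketched as a small driver loop. This is a minimal sketch under stated assumptions: `restore`, `update_reflectance` and `update_variance` are hypothetical names standing in for the GPU passes, and the order of the two updates within an iteration is assumed rather than taken from the source.

```python
def restore(reflectance, variance, update_reflectance, update_variance,
            total_iterations, joint_iterations=20):
    """Alternate both updates for the first `joint_iterations` iterations,
    then keep the reflectance fixed and update only the variance.

    The two update callables are placeholders for the actual GPU passes.
    """
    for iteration in range(total_iterations):
        if iteration < joint_iterations:
            # Assumed order: reflectance first, then variance.
            reflectance = update_reflectance(reflectance, variance)
        variance = update_variance(reflectance, variance)
    return reflectance, variance
```

For example, passing counting stubs such as `lambda r, v: r + 1` and `lambda r, v: v + 1` with `total_iterations=500` shows that the reflectance is updated 20 times and the variance 500 times.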
The following table shows the speed of our GPU implementation on this 128×128 image. For comparison, an efficient CPU implementation of Husby's model takes 1.09 seconds per iteration.
| Stage | Running time (ms) |
|---|---|
| Update reflectance | 6.625 |
| Update variance | 4.642 |
| Total per iteration | 11.267 |
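The timings can be turned into a per-iteration speedup with a few lines of arithmetic. The snippet uses only the figures reported here (1.09 s per iteration on the CPU, 6.625 ms and 4.642 ms for the two GPU stages); the variable names are chosen for this example, and the estimate ignores the per-image point spread function estimation, which runs on the CPU.

```python
# Figures reported for the 128x128 region of interest.
cpu_ms_per_iteration = 1.09 * 1000.0  # 1.09 s per iteration on the CPU
gpu_reflectance_ms = 6.625            # GPU: update reflectance
gpu_variance_ms = 4.642               # GPU: update variance

# One full GPU iteration runs both stages.
gpu_ms_per_iteration = gpu_reflectance_ms + gpu_variance_ms
speedup = cpu_ms_per_iteration / gpu_ms_per_iteration

print(f"GPU time per iteration: {gpu_ms_per_iteration:.3f} ms")
print(f"Speedup over CPU: {speedup:.1f}x")
```

The two stage times sum to the reported 11.267 ms, and the ratio comes out at roughly 97 times faster per iteration than the CPU implementation.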
Reference
Remarks
This work is described in:
Last Modified: July 19, 2004