Untitled

 avatar
unknown
plain_text
2 years ago
13 kB
14
Indexable
Hi Anders,

Thank you for your swift response. I've made progress and managed to compile the C code and run the MATLAB code. Here's a detailed rundown of what I did:

### System & Software Configuration:

- **MATLAB Version**: 2021b
- **CUDA Version**: 11.7.0
- **NVIDIA Driver Version**: 515.43.04
- **GCC Version**: 10.3.0

###  Compilation of MCmatlab.c:

```
mex COPTIMFLAGS='$COPTIMFLAGS -Ofast -fopenmp -std=c11 -Wall' LDOPTIMFLAGS='$LDOPTIMFLAGS -Ofast -fopenmp -std=c11 -Wall' -outdir +MCmatlab/@model/private ./+MCmatlab/src/MCmatlab.c -lut
mex COPTIMFLAGS='$COPTIMFLAGS -Ofast -std=c11 -Wall' LDOPTIMFLAGS='$LDOPTIMFLAGS -Ofast -std=c11 -Wall' -outdir +MCmatlab/@model/private -output MCmatlab_singlethreaded ./+MCmatlab/src/MCmatlab.c -lut
```

### Compilation of finiteElementHeatPropagator.c:

```
mex COPTIMFLAGS='$COPTIMFLAGS -Ofast -fopenmp -std=c11 -Wall' LDOPTIMFLAGS='$LDOPTIMFLAGS -Ofast -fopenmp -std=c11 -Wall' -outdir +MCmatlab/@model/private ./+MCmatlab/src/finiteElementHeatPropagator.c -lut
mex COPTIMFLAGS='$COPTIMFLAGS -Ofast -std=c11 -Wall' LDOPTIMFLAGS='$LDOPTIMFLAGS -Ofast -std=c11 -Wall' -outdir +MCmatlab/@model/private -output finiteElementHeatPropagator_singlethreaded ./+MCmatlab/src/finiteElementHeatPropagator.c -lut
```

### Compilation of getDownsampledParamVals.c:

```
mex COPTIMFLAGS='$COPTIMFLAGS -Ofast -fopenmp -std=c11 -Wall' LDOPTIMFLAGS='$LDOPTIMFLAGS -Ofast -fopenmp -std=c11 -Wall' -outdir +MCmatlab/@model/private ./+MCmatlab/src/getDownsampledParamVals.c
```

### Preparation for CUDA Compilation:

```
cp ./+MCmatlab/src/MCmatlab.c ./+MCmatlab/src/MCmatlab_CUDA.cu
cp ./+MCmatlab/src/finiteElementHeatPropagator.c ./+MCmatlab/src/finiteElementHeatPropagator_CUDA.cu
```

### Compilation Using mexcuda:

Initially, using mexcuda directly in the command line did not work. So, I opened MATLAB via the command line and compiled with:

```
mexcuda COMPFLAGS='-use_fast_math -res-usage $COMPFLAGS' -outdir +MCmatlab/@model/private ./+MCmatlab/src/MCmatlab_CUDA.cu
mexcuda COMPFLAGS='-use_fast_math -res-usage $COMPFLAGS' -outdir +MCmatlab/@model/private ./+MCmatlab/src/finiteElementHeatPropagator_CUDA.cu
```

Note: I had to remove the -llibut flag and commented out the utIsInterruptPending function in the .cu files. Since it's used to catch ctrl+c during MEX execution, I believe it's not crucial.

### Performance Testing:

#### Without GPU:

```
function fluence = main_mc()
%% Geometry definition
MCmatlab.closeMCmatlabFigures();
model = MCmatlab.model;

model.G.nx                = 200; % Number of bins in the x direction
model.G.ny                = 200; % Number of bins in the y direction
model.G.nz                = 30; % Number of bins in the z direction  %10
model.G.Lx                = 1; % [cm] x size of simulation cuboid
model.G.Ly                = 1; % [cm] y size of simulation cuboid
model.G.Lz                = 0.15; % [cm] z size of simulation cuboid  %0.1

model.G.mediaPropertiesFunc = @mediaPropertiesFunc; % Media properties defined as a function at the end of this file
model.G.geomFunc            = @geometryDefinition; % Function to use for defining the distribution of media in the cuboid. Defined at the end of this m file.

%model = plot(model,'G');

% X = model.G.x;
% Y = model.G.y;
% Z = model.G.z;

%% Top-hat focal plane, Gaussian angular intensity distribution
model.MC.nPhotonsRequested        = 4*5e6; % # of photons to launch

% model.MC.nExamplePaths            = 20;

model.MC.matchedInterfaces        = true; % Assumes all refractive indices are the same
model.MC.boundaryType             = 2;    % 0: No escaping boundaries, 1: All cuboid boundaries are escaping, 2: Top cuboid boundary only is escaping, 3: Top and bottom boundaries are escaping, while the side boundaries are cyclic

w = 400%:1:700;
for i=1%:3%:length(w)
model.MC.wavelength               = w(i);
model.MC.lightSource.xFocus       = 0; % [cm] x position of focus
model.MC.lightSource.yFocus       = 0; % [cm] y position of focus
model.MC.lightSource.zFocus       = 0; % [cm] z position of focus

model.MC.lightSource.sourceType   = 4; % 0: Pencil beam, 1: Isotropically emitting line or point source, 2: Infinite plane wave, 3: Laguerre-Gaussian LG01 beam, 4: Radial-factorizable beam (e.g., a Gaussian beam), 5: X/Y factorizable beam (e.g., a rectangular LED emitter)

model.MC.lightSource.focalPlaneIntensityDistribution.radialDistr = 0; % Radial focal plane intensity distribution - 0: Top-hat, 1: Gaussian, Array: Custom. Doesn't need to be normalized.
model.MC.lightSource.focalPlaneIntensityDistribution.radialWidth = 1.2; % [cm] Radial focal plane 1/e^2 radius if top-hat or Gaussian or half-width of the full distribution if custom
model.MC.lightSource.angularIntensityDistribution.radialDistr = 2; % Radial angular intensity distribution - 0: Top-hat, 1: Gaussian, 2: Cosine (Lambertian), Array: Custom. Doesn't need to be normalized.
model.MC.lightSource.angularIntensityDistribution.radialWidth = 0; %pi/6; % [rad] Radial angular 1/e^2 half-angle if top-hat or Gaussian or half-angle of the full distribution if custom. For a diffraction limited Gaussian beam, this should be set to model.MC.wavelength*1e-9/(pi*model.MC.lightSource.focalPlaneIntensityDistribution.radialWidth*1e-2))
model.MC.lightSource.theta = 0; % [rad] Polar angle of beam center axis         = 0;  %pi/2; % [rad] Azimuthal angle of light source center axis

% Collector
model.MC.useLightCollector        = true;
model.MC.lightCollector.x         = 0;   %0; % [cm] x position of either the center of the objective lens focal plane or the fiber tip
model.MC.lightCollector.y         = 0;   %-0.005; % [cm] y position
model.MC.lightCollector.z         = 0; % [cm] z position

model.MC.lightCollector.theta     = 0; % [rad] Polar angle of direction the light collector is facing
model.MC.lightCollector.phi       = 0; %pi/2; % [rad] Azimuthal angle of direction the light collector is facing

model.MC.lightCollector.f         = 0.0001; % [cm] Focal length of the objective lens (if light collector is a fiber, set this to Inf).
model.MC.lightCollector.diam      = sqrt(model.G.Lx^2 + model.G.Ly^2);   % [cm] Diameter of the light collector aperture. For an ideal thin lens, this is 2*f*tan(asin(NA)).
model.MC.lightCollector.fieldSize = sqrt(model.G.Lx^2 + model.G.Ly^2);   % [cm] Field Size of the imaging system (diameter of area in object plane that gets imaged). Only used for finite f.

model.MC.lightCollector.NA        = 0.22; % [-] Fiber NA. Only used for infinite f.
model.MC.lightCollector.res       = 200;   % X and Y resolution of light collector in pixels, only used for finite f
model.MC.depositionCriteria.onlyCollected = true;


%CPU
model.MC.useGPU                   = false; % (Default: false) Use CUDA acceleration for NVIDIA GPUs
% model = runMonteCarlo(model);

%GPU
%model.MC.useGPU                   = true;  % (Default: false) Use CUDA acceleration for NVIDIA GPUs
%model.MC.GPUdevice                = 0;     % (Default: 0, the first GPU) The index of the GPU device to use for the simulation
tic
model = runMonteCarlo(model);
toc

fluencee = model.MC.normalizedFluenceRate;
fluence {i,1} = fluencee(:,:,4);

end
end
%save spatially_varying_epidermis_16_voxs_CUDA.mat spatially_varying_epidermis_16_voxs_CUDA
%% Geometry function(s) (see readme for details)
function M = geometryDefinition(X,Y,Z,parameters)
air_thick = 0.0175;
dermis_thick = 0.0375;
  
rng(0,'twister');
a = 0;
b = 100;
r = (b-a).*rand(16,1) + a;
r = r/10000;
Out_Epth = reshape(r, [4,4]);
media(1,:) = 2:5;
media(2,:) = 6:9;
media(3,:) = 10:13;
media(4,:) = 14:17;
x_res = 0.25;
y_res = 0.25;   

M = ones(size(X)); % Air
 
for m=1:4
    for n=1:4
        M((X>=(-0.5+(m-1)*x_res))&((-0.5+m*x_res)>X)   &   (Y>=-0.5+(n-1)*y_res)&((-0.5+n*y_res)>Y)   &   (Z>=air_thick)&((air_thick + Out_Epth(m,n))>Z)) = media(m,n); 
        M((X>=(-0.5+(m-1)*x_res))&((-0.5+m*x_res)>X)   &   (Y>=-0.5+(n-1)*y_res)&((-0.5+n*y_res)>Y)   &   (Z>=(air_thick + Out_Epth(m,n)))&((air_thick + Out_Epth(m,n)+dermis_thick)>Z)) = 18; %dermis
        M(Z > (air_thick + Out_Epth(m,n)+dermis_thick)) = 19; %fat
    end
end
end


%% Media Properties function (see readme for details)
function mediaProperties = mediaPropertiesFunc(parameters)
  disp("Inside")
  mediaProperties = MCmatlab.mediumProperties; 
  

  j=1;
  mediaProperties(j).name  = 'air';
  mediaProperties(j).mua   = 1e-8; % Absorption coefficient [cm^-1]
  mediaProperties(j).mus   = 1e-8; % Scattering coefficient [cm^-1]
  mediaProperties(j).g     = 1;    % Henyey-Greenstein scattering anisotropy
  
  
  mel = 0.01:0.005:0.1;
  hem = 0;
  water = 0.75;
  s = 0.75;
  fat = 0;
  
  for j=2:17
      mediaProperties(j).name  = ['vox', num2str(j-1)];
      mediaProperties(j).mua = @(wavelength) calc_mua(wavelength, s,hem,water,fat,mel(j-1));
      mediaProperties(j).mus = @(wavelength) calc_mus(wavelength, 40,0,1,0.9);
      mediaProperties(j).g   = 0.9;
      mediaProperties(j).VHC = 3391*1.109e-3; % [J cm^-3 K^-1]
      mediaProperties(j).TC  = 0.37e-2; % [W cm^-1 K^-1]
  end  
  
  
  j=18;
  mediaProperties(j).name = 'dermis';
  mediaProperties(j).mua = @func_mua18;
  function mua = func_mua18(wavelength)
    B = 0.002; % Blood content
    S = 0.67; % Blood oxygen saturation
    W = 0.65; % Water content
    M = 0; % Melanin content
    F = 0; % Fat content
    mua = calc_mua(wavelength,S,B,W,F,M); % Jacques "Optical properties of biological tissues: a review" eq. 12
  end
  mediaProperties(j).mus = @func_mus18;
  function mus = func_mus18(wavelength)
    aPrime = 42.4; % musPrime at 500 nm
    fRay = 0.62; % Fraction of scattering due to Rayleigh scattering
    bMie = 1; % Scattering power for Mie scattering
    g = 0.9; % Scattering anisotropy
    mus = calc_mus(wavelength,aPrime,fRay,bMie,g); % Jacques "Optical properties of biological tissues: a review" eq. 2
  end
  mediaProperties(j).g   = 0.9;
  mediaProperties(j).VHC = 3391*1.109e-3; % [J cm^-3 K^-1]
  mediaProperties(j).TC  = 0.37e-2; % [W cm^-1 K^-1]
  
  
  j=19;
  mediaProperties(j).name  = 'fat';
  mediaProperties(j).mua = @func_mua19;
  function mua = func_mua19(wavelength)
    B = 0;     % Blood content
    S = 0;  % Blood oxygen saturation
    W = 0.05;  % Water content
    M = 0; % Melanin content
    F = 0.95;     % Fat content
    mua = calc_mua(wavelength,S,B,W,F,M); % Jacques "Optical properties of biological tissues: a review" eq. 12
  end
  mediaProperties(j).mus = @func_mus19;
  function mus = func_mus19(wavelength)
    aPrime = 40; % musPrime at 500 nm
    fRay = 0.75;    % Fraction of scattering due to Rayleigh scattering
    bMie = 1;    % Scattering power for Mie scattering
    g = 0.9;     % Scattering anisotropy
    mus = calc_mus(wavelength,aPrime,fRay,bMie,g); % Jacques "Optical properties of biological tissues: a review" eq. 2
  end
  mediaProperties(j).g   = 0.9; 
end

 
function mus = calc_mus(wavelength,aPrime,fRay,bMie,g)
  fMie = 1 - fRay;
  musPrime= aPrime*(fRay*(wavelength/500)^(-4) + fMie*(wavelength/500)^(-bMie)); % Jacques "Optical properties of biological tissues: a review" eq. 2
  mus = musPrime/(1-g);
end



function mua = calc_mua(wavelength,S,B,W,F,M)
  persistent muadeoxy muafat muamel muaoxy muawater musp nmLIB
  if isempty(muadeoxy)
    load('spectralLIB.mat','muadeoxy','muafat','muamel','muaoxy','muawater','musp','nmLIB'); 
  end
  mua_deoxy = interp1(nmLIB,muadeoxy,wavelength);
  mua_fat = interp1(nmLIB,muafat,wavelength);
  mua_mel = interp1(nmLIB,muamel,wavelength);
  mua_oxy = interp1(nmLIB,muaoxy,wavelength);
  mua_water = interp1(nmLIB,muawater,wavelength);

  % Jacques "Optical properties of biological tissues: a review" eq. 12:
  mua = B*S*mua_oxy + B*(1-S)*mua_deoxy + W*mua_water + F*mua_fat + M*mua_mel; % Equation 12 without the bilirubin and beta-Carotene terms
end
```

Result: Took around 130 seconds.

#### With GPU

Lines 63 to 69 are swapped.

```
%CPU
%model.MC.useGPU                   = false; % (Default: false) Use CUDA acceleration for NVIDIA GPUs
% model = runMonteCarlo(model);

%GPU
model.MC.useGPU                   = true;  % (Default: false) Use CUDA acceleration for NVIDIA GPUs
model.MC.GPUdevice                = 0;     % (Default: 0, the first GPU) The index of the GPU device to use for the simulation
```

Result: Took around 180 seconds.

From the above tests, it appears that the GPU accelerated version is slower. I'd appreciate your insights or suggestions on what might be causing this.
Editor is loading...