Untitled
unknown
plain_text
a year ago
13 kB
9
Indexable
Never
Hi Anders, Thank you for your swift response. I've made progress and managed to compile the C code and run the MATLAB code. Here's a detailed rundown of what I did: ### System & Software Configuration: - **MATLAB Version**: 2021b - **CUDA Version**: 11.7.0 - **NVIDIA Driver Version**: 515.43.04 - **GCC Version**: 10.3.0 ### Compilation of MCmatlab.c: ``` mex COPTIMFLAGS='$COPTIMFLAGS -Ofast -fopenmp -std=c11 -Wall' LDOPTIMFLAGS='$LDOPTIMFLAGS -Ofast -fopenmp -std=c11 -Wall' -outdir +MCmatlab/@model/private ./+MCmatlab/src/MCmatlab.c -lut mex COPTIMFLAGS='$COPTIMFLAGS -Ofast -std=c11 -Wall' LDOPTIMFLAGS='$LDOPTIMFLAGS -Ofast -std=c11 -Wall' -outdir +MCmatlab/@model/private -output MCmatlab_singlethreaded ./+MCmatlab/src/MCmatlab.c -lut ``` ### Compilation of finiteElementHeatPropagator.c: ``` mex COPTIMFLAGS='$COPTIMFLAGS -Ofast -fopenmp -std=c11 -Wall' LDOPTIMFLAGS='$LDOPTIMFLAGS -Ofast -fopenmp -std=c11 -Wall' -outdir +MCmatlab/@model/private ./+MCmatlab/src/finiteElementHeatPropagator.c -lut mex COPTIMFLAGS='$COPTIMFLAGS -Ofast -std=c11 -Wall' LDOPTIMFLAGS='$LDOPTIMFLAGS -Ofast -std=c11 -Wall' -outdir +MCmatlab/@model/private -output finiteElementHeatPropagator_singlethreaded ./+MCmatlab/src/finiteElementHeatPropagator.c -lut ``` ### Compilation of getDownsampledParamVals.c: ``` mex COPTIMFLAGS='$COPTIMFLAGS -Ofast -fopenmp -std=c11 -Wall' LDOPTIMFLAGS='$LDOPTIMFLAGS -Ofast -fopenmp -std=c11 -Wall' -outdir +MCmatlab/@model/private ./+MCmatlab/src/getDownsampledParamVals.c ``` ### Preparation for CUDA Compilation: ``` cp ./+MCmatlab/src/MCmatlab.c ./+MCmatlab/src/MCmatlab_CUDA.cu cp ./+MCmatlab/src/finiteElementHeatPropagator.c ./+MCmatlab/src/finiteElementHeatPropagator_CUDA.cu ``` ### Compilation Using mexcuda: Initially, using mexcuda directly in the command line did not work. So, I opened MATLAB via the command line and compiled with: ``` mexcuda COMPFLAGS='-use_fast_math -res-usage $COMPFLAGS' -outdir +MCmatlab/@model/private ./+MCmatlab/src/MCmatlab_CUDA.cu mexcuda COMPFLAGS='-use_fast_math -res-usage $COMPFLAGS' -outdir +MCmatlab/@model/private ./+MCmatlab/src/finiteElementHeatPropagator_CUDA.cu ``` Note: I had to remove the -llibut flag and commented out the utIsInterruptPending function in the .cu files. Since it's used to catch ctrl+c during MEX execution, I believe it's not crucial. ### Performance Testing: #### Without GPU: ``` function fluence = main_mc() %% Geometry definition MCmatlab.closeMCmatlabFigures(); model = MCmatlab.model; model.G.nx = 200; % Number of bins in the x direction model.G.ny = 200; % Number of bins in the y direction model.G.nz = 30; % Number of bins in the z direction %10 model.G.Lx = 1; % [cm] x size of simulation cuboid model.G.Ly = 1; % [cm] y size of simulation cuboid model.G.Lz = 0.15; % [cm] z size of simulation cuboid %0.1 model.G.mediaPropertiesFunc = @mediaPropertiesFunc; % Media properties defined as a function at the end of this file model.G.geomFunc = @geometryDefinition; % Function to use for defining the distribution of media in the cuboid. Defined at the end of this m file. %model = plot(model,'G'); % X = model.G.x; % Y = model.G.y; % Z = model.G.z; %% Top-hat focal plane, Gaussian angular intensity distribution model.MC.nPhotonsRequested = 4*5e6; % # of photons to launch % model.MC.nExamplePaths = 20; model.MC.matchedInterfaces = true; % Assumes all refractive indices are the same model.MC.boundaryType = 2; % 0: No escaping boundaries, 1: All cuboid boundaries are escaping, 2: Top cuboid boundary only is escaping, 3: Top and bottom boundaries are escaping, while the side boundaries are cyclic w = 400%:1:700; for i=1%:3%:length(w) model.MC.wavelength = w(i); model.MC.lightSource.xFocus = 0; % [cm] x position of focus model.MC.lightSource.yFocus = 0; % [cm] y position of focus model.MC.lightSource.zFocus = 0; % [cm] z position of focus model.MC.lightSource.sourceType = 4; % 0: Pencil beam, 1: Isotropically emitting line or point source, 2: Infinite plane wave, 3: Laguerre-Gaussian LG01 beam, 4: Radial-factorizable beam (e.g., a Gaussian beam), 5: X/Y factorizable beam (e.g., a rectangular LED emitter) model.MC.lightSource.focalPlaneIntensityDistribution.radialDistr = 0; % Radial focal plane intensity distribution - 0: Top-hat, 1: Gaussian, Array: Custom. Doesn't need to be normalized. model.MC.lightSource.focalPlaneIntensityDistribution.radialWidth = 1.2; % [cm] Radial focal plane 1/e^2 radius if top-hat or Gaussian or half-width of the full distribution if custom model.MC.lightSource.angularIntensityDistribution.radialDistr = 2; % Radial angular intensity distribution - 0: Top-hat, 1: Gaussian, 2: Cosine (Lambertian), Array: Custom. Doesn't need to be normalized. model.MC.lightSource.angularIntensityDistribution.radialWidth = 0; %pi/6; % [rad] Radial angular 1/e^2 half-angle if top-hat or Gaussian or half-angle of the full distribution if custom. For a diffraction limited Gaussian beam, this should be set to model.MC.wavelength*1e-9/(pi*model.MC.lightSource.focalPlaneIntensityDistribution.radialWidth*1e-2)) model.MC.lightSource.theta = 0; % [rad] Polar angle of beam center axis = 0; %pi/2; % [rad] Azimuthal angle of light source center axis % Collector model.MC.useLightCollector = true; model.MC.lightCollector.x = 0; %0; % [cm] x position of either the center of the objective lens focal plane or the fiber tip model.MC.lightCollector.y = 0; %-0.005; % [cm] y position model.MC.lightCollector.z = 0; % [cm] z position model.MC.lightCollector.theta = 0; % [rad] Polar angle of direction the light collector is facing model.MC.lightCollector.phi = 0; %pi/2; % [rad] Azimuthal angle of direction the light collector is facing model.MC.lightCollector.f = 0.0001; % [cm] Focal length of the objective lens (if light collector is a fiber, set this to Inf). model.MC.lightCollector.diam = sqrt(model.G.Lx^2 + model.G.Ly^2); % [cm] Diameter of the light collector aperture. For an ideal thin lens, this is 2*f*tan(asin(NA)). model.MC.lightCollector.fieldSize = sqrt(model.G.Lx^2 + model.G.Ly^2); % [cm] Field Size of the imaging system (diameter of area in object plane that gets imaged). Only used for finite f. model.MC.lightCollector.NA = 0.22; % [-] Fiber NA. Only used for infinite f. model.MC.lightCollector.res = 200; % X and Y resolution of light collector in pixels, only used for finite f model.MC.depositionCriteria.onlyCollected = true; %CPU model.MC.useGPU = false; % (Default: false) Use CUDA acceleration for NVIDIA GPUs % model = runMonteCarlo(model); %GPU %model.MC.useGPU = true; % (Default: false) Use CUDA acceleration for NVIDIA GPUs %model.MC.GPUdevice = 0; % (Default: 0, the first GPU) The index of the GPU device to use for the simulation tic model = runMonteCarlo(model); toc fluencee = model.MC.normalizedFluenceRate; fluence {i,1} = fluencee(:,:,4); end end %save spatially_varying_epidermis_16_voxs_CUDA.mat spatially_varying_epidermis_16_voxs_CUDA %% Geometry function(s) (see readme for details) function M = geometryDefinition(X,Y,Z,parameters) air_thick = 0.0175; dermis_thick = 0.0375; rng(0,'twister'); a = 0; b = 100; r = (b-a).*rand(16,1) + a; r = r/10000; Out_Epth = reshape(r, [4,4]); media(1,:) = 2:5; media(2,:) = 6:9; media(3,:) = 10:13; media(4,:) = 14:17; x_res = 0.25; y_res = 0.25; M = ones(size(X)); % Air for m=1:4 for n=1:4 M((X>=(-0.5+(m-1)*x_res))&((-0.5+m*x_res)>X) & (Y>=-0.5+(n-1)*y_res)&((-0.5+n*y_res)>Y) & (Z>=air_thick)&((air_thick + Out_Epth(m,n))>Z)) = media(m,n); M((X>=(-0.5+(m-1)*x_res))&((-0.5+m*x_res)>X) & (Y>=-0.5+(n-1)*y_res)&((-0.5+n*y_res)>Y) & (Z>=(air_thick + Out_Epth(m,n)))&((air_thick + Out_Epth(m,n)+dermis_thick)>Z)) = 18; %dermis M(Z > (air_thick + Out_Epth(m,n)+dermis_thick)) = 19; %fat end end end %% Media Properties function (see readme for details) function mediaProperties = mediaPropertiesFunc(parameters) disp("Inside") mediaProperties = MCmatlab.mediumProperties; j=1; mediaProperties(j).name = 'air'; mediaProperties(j).mua = 1e-8; % Absorption coefficient [cm^-1] mediaProperties(j).mus = 1e-8; % Scattering coefficient [cm^-1] mediaProperties(j).g = 1; % Henyey-Greenstein scattering anisotropy mel = 0.01:0.005:0.1; hem = 0; water = 0.75; s = 0.75; fat = 0; for j=2:17 mediaProperties(j).name = ['vox', num2str(j-1)]; mediaProperties(j).mua = @(wavelength) calc_mua(wavelength, s,hem,water,fat,mel(j-1)); mediaProperties(j).mus = @(wavelength) calc_mus(wavelength, 40,0,1,0.9); mediaProperties(j).g = 0.9; mediaProperties(j).VHC = 3391*1.109e-3; % [J cm^-3 K^-1] mediaProperties(j).TC = 0.37e-2; % [W cm^-1 K^-1] end j=18; mediaProperties(j).name = 'dermis'; mediaProperties(j).mua = @func_mua18; function mua = func_mua18(wavelength) B = 0.002; % Blood content S = 0.67; % Blood oxygen saturation W = 0.65; % Water content M = 0; % Melanin content F = 0; % Fat content mua = calc_mua(wavelength,S,B,W,F,M); % Jacques "Optical properties of biological tissues: a review" eq. 12 end mediaProperties(j).mus = @func_mus18; function mus = func_mus18(wavelength) aPrime = 42.4; % musPrime at 500 nm fRay = 0.62; % Fraction of scattering due to Rayleigh scattering bMie = 1; % Scattering power for Mie scattering g = 0.9; % Scattering anisotropy mus = calc_mus(wavelength,aPrime,fRay,bMie,g); % Jacques "Optical properties of biological tissues: a review" eq. 2 end mediaProperties(j).g = 0.9; mediaProperties(j).VHC = 3391*1.109e-3; % [J cm^-3 K^-1] mediaProperties(j).TC = 0.37e-2; % [W cm^-1 K^-1] j=19; mediaProperties(j).name = 'fat'; mediaProperties(j).mua = @func_mua19; function mua = func_mua19(wavelength) B = 0; % Blood content S = 0; % Blood oxygen saturation W = 0.05; % Water content M = 0; % Melanin content F = 0.95; % Fat content mua = calc_mua(wavelength,S,B,W,F,M); % Jacques "Optical properties of biological tissues: a review" eq. 12 end mediaProperties(j).mus = @func_mus19; function mus = func_mus19(wavelength) aPrime = 40; % musPrime at 500 nm fRay = 0.75; % Fraction of scattering due to Rayleigh scattering bMie = 1; % Scattering power for Mie scattering g = 0.9; % Scattering anisotropy mus = calc_mus(wavelength,aPrime,fRay,bMie,g); % Jacques "Optical properties of biological tissues: a review" eq. 2 end mediaProperties(j).g = 0.9; end function mus = calc_mus(wavelength,aPrime,fRay,bMie,g) fMie = 1 - fRay; musPrime= aPrime*(fRay*(wavelength/500)^(-4) + fMie*(wavelength/500)^(-bMie)); % Jacques "Optical properties of biological tissues: a review" eq. 2 mus = musPrime/(1-g); end function mua = calc_mua(wavelength,S,B,W,F,M) persistent muadeoxy muafat muamel muaoxy muawater musp nmLIB if isempty(muadeoxy) load('spectralLIB.mat','muadeoxy','muafat','muamel','muaoxy','muawater','musp','nmLIB'); end mua_deoxy = interp1(nmLIB,muadeoxy,wavelength); mua_fat = interp1(nmLIB,muafat,wavelength); mua_mel = interp1(nmLIB,muamel,wavelength); mua_oxy = interp1(nmLIB,muaoxy,wavelength); mua_water = interp1(nmLIB,muawater,wavelength); % Jacques "Optical properties of biological tissues: a review" eq. 12: mua = B*S*mua_oxy + B*(1-S)*mua_deoxy + W*mua_water + F*mua_fat + M*mua_mel; % Equation 12 without the bilirubin and beta-Carotene terms end ``` Result: Took around 130 seconds. #### With GPU Lines 63 to 69 are swapped. ``` %CPU %model.MC.useGPU = false; % (Default: false) Use CUDA acceleration for NVIDIA GPUs % model = runMonteCarlo(model); %GPU model.MC.useGPU = true; % (Default: false) Use CUDA acceleration for NVIDIA GPUs model.MC.GPUdevice = 0; % (Default: 0, the first GPU) The index of the GPU device to use for the simulation ``` Result: Took around 180 seconds. From the above tests, it appears that the GPU accelerated version is slower. I'd appreciate your insights or suggestions on what might be causing this.