What's the common strategy to optimize C++ arithmetic computation for arrays?

September 15, 2011

For example, I have 3 float arrays, a, b, and c, and I want to add a and b element-wise into c. The naive way looks like this:

for (int i = 0; i < n; i++) {
    c[i] = a[i] + b[i];
}

As far as I know, OpenMP can parallelize this piece of code. In the OpenCV code, I see flags such as CV_SSE2 and CV_NEON that are related to optimization.

What's the common way to optimize this kind of code, if I want the code to be highly efficient?

There is no common strategy. You should first be sure that this loop is a bottleneck (which it might not be, if the size n of the arrays is small enough).
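One way to check whether the loop matters at all is to time it at the array sizes you actually use. The following is only a rough sketch: add_arrays is the naive loop from the question wrapped in a function, and time_add_us is a hypothetical helper name introduced here for illustration. It assumes n > 0, and a serious benchmark would need warm-up runs and more care about compiler optimizations.

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

// Naive element-wise addition, as in the question.
void add_arrays(const float* a, const float* b, float* c, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        c[i] = a[i] + b[i];
    }
}

// Hypothetical helper: average wall-clock time of one add_arrays call,
// in microseconds, over `repeats` calls on arrays of size n (n > 0).
double time_add_us(std::size_t n, int repeats) {
    std::vector<float> a(n, 1.0f), b(n, 2.0f), c(n, 0.0f);
    volatile float sink = 0.0f;  // discourages the compiler from dropping the work
    auto start = std::chrono::steady_clock::now();
    for (int r = 0; r < repeats; ++r) {
        add_arrays(a.data(), b.data(), c.data(), n);
        sink = c[n / 2];
    }
    auto stop = std::chrono::steady_clock::now();
    std::chrono::duration<double, std::micro> elapsed = stop - start;
    return elapsed.count() / repeats;
}
```

Calling time_add_us with a small n and with a large n, and comparing the results with the time spent in the rest of the program, gives a first idea of whether this loop deserves any optimization effort.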

Some compilers are able to optimize this (at least in simple cases) using vector machine instructions. With GCC, try compiling with gcc -O3 -mtune=native (or other -mtune=... or -mfpu=... arguments, in particular if cross-compiling) and possibly -ffast-math.
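As a sketch of what a compiler-friendly version of the loop might look like: the function name add_arrays_restrict is made up for this example, and __restrict is a compiler extension (accepted by GCC, Clang and MSVC), not standard C++. It promises that the arrays do not overlap, which can make it easier for the compiler to emit vector instructions without runtime overlap checks.

```cpp
#include <cstddef>

// __restrict (a compiler extension, not standard C++) tells the compiler
// that a, b and c never alias each other. The caller must guarantee this;
// passing overlapping arrays is undefined behavior.
void add_arrays_restrict(const float* __restrict a,
                         const float* __restrict b,
                         float* __restrict c,
                         std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        c[i] = a[i] + b[i];
    }
}
```

With GCC, adding -fopt-info-vec to the -O3 command line reports which loops were vectorized, so you can check what the compiler did instead of guessing.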

You could consider OpenMP, OpenCL (with GPGPU), OpenACC, MPI, or explicit threading, e.g. with pthreads or C++11 std::thread-s, etc. (and a clever mix of several approaches).
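To illustrate just one of these options, here is a minimal OpenMP sketch of the same loop (add_arrays_omp is a name invented for this example). When built with OpenMP enabled (e.g. -fopenmp with GCC), the iterations are divided among threads; without that flag the pragma is ignored and the loop runs serially with the same result.

```cpp
// Element-wise addition with an OpenMP work-sharing loop.
// Each iteration writes a distinct c[i], so iterations are independent
// and can safely run on different threads.
void add_arrays_omp(const float* a, const float* b, float* c, int n) {
    #pragma omp parallel for
    for (int i = 0; i < n; ++i) {
        c[i] = a[i] + b[i];
    }
}
```

For a loop this simple the running time is often dominated by memory traffic, not arithmetic, so extra threads may help less than expected, and for small n the cost of starting threads can outweigh any gain. Measuring both versions is the only reliable way to tell.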

I would leave the optimization to the compiler, and only consider improving it if you measure that it is a bottleneck. You could spend months or years (or even specialize your whole working life) of developer time improving it ...

You could also use a numerical computation library (e.g. LAPACK, GSL, etc.) or specialized software such as Scilab, Octave, R, etc.

Also read http://floating-point-gui.de/
