Guide into OpenMP: Easy multithreading programming for C++

The for construct splits the for-loop so that each thread in the current team handles a different portion of the loop.

#pragma omp for
 for(int n=0; n<10; ++n)
 {
   printf(" %d", n);
 }
 printf(".\n");

This loop will output each number from 0…9 once. However, it may do it in arbitrary order. It may output, for example:

0 5 6 7 1 8 2 3 4 9.

Internally, the above loop becomes code roughly equivalent to this:

 int this_thread = omp_get_thread_num();
 int num_threads = omp_get_num_threads();
 int my_start = (this_thread  ) * 10 / num_threads;
 int my_end   = (this_thread+1) * 10 / num_threads;
 for(int n=my_start; n<my_end; ++n)
   printf(" %d", n);

So each thread gets a different section of the loop, and they execute their own sections in parallel.

Note: #pragma omp for only delegates portions of the loop for different threads in the current team. A team is the group of threads executing the program. At program start, the team consists only of a single member: the master thread that runs the program.

To create a new team of threads, you need to specify the parallel keyword. It can be specified in the surrounding context:

#pragma omp parallel
 {
  #pragma omp for
  for(int n=0; n<10; ++n) printf(" %d", n);
 }
 printf(".\n");

An equivalent shorthand is to specify it in the pragma itself, as #pragma omp parallel for:

#pragma omp parallel for
 for(int n=0; n<10; ++n) printf(" %d", n);
 printf(".\n");

You can explicitly specify the number of threads to be created in the team, using the num_threads clause:

#pragma omp parallel num_threads(3)
 {
   // This code will be executed by three threads.

   // Chunks of this loop will be divided amongst
   // the (three) threads of the current team.

   #pragma omp for
   for(int n=0; n<10; ++n) printf(" %d", n);
 }

Note that OpenMP also works in C. However, in C89 you must declare the loop variable outside the loop and explicitly mark it as private, because C89 does not allow declaring it in the loop header:

int n;
 #pragma omp for private(n)
 for(n=0; n<10; ++n) printf(" %d", n);
 printf(".\n");

See the “private and shared clauses” section for details.

In OpenMP 2.5, the iteration variable in for must be of a signed integer type. OpenMP 3.0 also allows an unsigned integer type, a pointer type, or a constant-time random access iterator type. In the latter case, std::distance() is used to determine the number of loop iterations.

The scheduling algorithm for the for-loop can be explicitly controlled.

#pragma omp for schedule(static)
 for(int n=0; n<10; ++n) printf(" %d", n);
 printf(".\n");

There are five scheduling kinds: static, dynamic, guided, runtime, and (since OpenMP 3.0) auto. In addition, there are three scheduling modifiers (since OpenMP 4.5): monotonic, nonmonotonic, and simd.

static is the typical default schedule, as shown above (the actual default is implementation defined). Upon entering the loop, each thread independently decides which chunk of the loop it will process, so no coordination between threads is needed while the loop runs.

There is also the dynamic schedule:

#pragma omp for schedule(dynamic)
 for(int n=0; n<10; ++n) printf(" %d", n);
 printf(".\n");

In the dynamic schedule, there is no predictable order in which the loop items are assigned to different threads. Each thread asks the OpenMP runtime library for an iteration number, then handles it, then asks for next, and so on. This is most useful when used in conjunction with the ordered clause, or when the different iterations in the loop may take different time to execute.

The chunk size can also be specified to lessen the number of calls to the runtime library:

#pragma omp for schedule(dynamic, 3)
 for(int n=0; n<10; ++n) printf(" %d", n);
 printf(".\n");

In this example, each thread asks for an iteration number, executes 3 iterations of the loop, then asks for another, and so on. The last chunk may be smaller than 3, though.

Internally, the loop above becomes code equivalent to this (an illustration using GCC's libgomp entry points; do not write code like this yourself):

int a,b;
  if(GOMP_loop_dynamic_start(0,10,1, 3, &a,&b))
  {
    do {
      for(int n=a; n<b; ++n) printf(" %d", n);
    } while(GOMP_loop_dynamic_next(&a,&b));
  }

The guided schedule is like dynamic, but the chunks start large and shrink as the loop progresses, combining the low overhead of static with the load balancing of dynamic. The original article links to an example program that visualizes the difference between the schedules (it requires libSDL to compile).

The “runtime” option defers the choice to run time: the schedule and chunk size are taken from the OMP_SCHEDULE environment variable, or from a prior call to omp_set_schedule(); if neither is set, the choice is implementation defined.

A scheduling modifier can be added to the clause, e.g. #pragma omp for schedule(nonmonotonic:dynamic). The modifiers are:

  • monotonic: Each thread executes chunks in an increasing iteration order.

  • nonmonotonic: Each thread executes chunks in an unspecified order.

  • simd: If the loop is a SIMD loop, the chunk size is rounded up to a multiple of the SIMD width the compiler chooses for the hardware. This modifier is ignored for non-SIMD loops.

Original: https://www.cnblogs.com/mfryf/p/12744547.html
Author: 知识天地
Title: Guide into OpenMP: Easy multithreading programming for C++
