Stack Overflow Asked by matthias_buehlmann on January 13, 2021
I can do
std::vector<int> a;
a.reserve(1000);
for(int i=0; i<1000; i++)
a.push_back(i);
std::for_each(std::execution::par_unseq, std::begin(a), std::end(a), [&](int i) {
// ... do something based on i ...
});
but is there a more elegant way of creating a parallelized version of for(int i=0; i<n; i++) that does not require me to first fill a vector with ascending ints?
Here are two ways to do it without pre-populating a vector just to store a sequence of integers.
You can do it with Boost.counting_range (or directly using Boost.counting_iterator, as you prefer) ... although good luck finding out how from reading the documentation.
auto range = boost::counting_range<int>(0,1000);
std::for_each(std::execution::par_unseq,
range.begin(),
range.end(),
[&](int i) {
// ... do something based on i ...
});
If you don't want to include Boost, we can write a simple version directly.
With no apology for munging iota and iterator together instead of coming up with a decent name, the below will let you write something similar to the Boost version above:
std::for_each(std::execution::par_unseq,
ioterable<int>(0),
ioterable<int>(1000),
[&](int i) {
// ... do something based on i ...
}
);
You can see how much boilerplate you save by using Boost for this:
template <typename NumericType>
struct ioterable
{
using iterator_category = std::forward_iterator_tag; // the parallel overloads are specified for forward iterators or stronger
using value_type = NumericType;
using difference_type = NumericType;
using pointer = std::add_pointer_t<NumericType>;
using reference = NumericType;
explicit ioterable(NumericType n) : val_(n) {}
ioterable() = default;
ioterable(ioterable&&) = default;
ioterable(ioterable const&) = default;
ioterable& operator=(ioterable&&) = default;
ioterable& operator=(ioterable const&) = default;
ioterable& operator++() { ++val_; return *this; }
ioterable operator++(int) { ioterable tmp(*this); ++val_; return tmp; }
bool operator==(ioterable const& other) const { return val_ == other.val_; }
bool operator!=(ioterable const& other) const { return val_ != other.val_; }
value_type operator*() const { return val_; }
private:
NumericType val_{ std::numeric_limits<NumericType>::max() };
};
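The parallel algorithm overloads are specified in terms of forward iterators, so the iterator category (and a signed difference_type) matter for a hand-rolled counting iterator. Below is a minimal, self-contained sketch under that assumption. The names counting_iter and sum_range are hypothetical, and the check runs sequentially so it needs no parallel runtime. Returning values by copy from operator* is a practical compromise rather than a strictly conforming forward iterator, and some standard library implementations may only truly parallelize random-access iterators.

```cpp
#include <cstddef>
#include <iterator>
#include <numeric>

// Hypothetical counting iterator; the name counting_iter is illustrative only.
template <typename NumericType>
struct counting_iter
{
    // Assumption: advertise forward traversal, the minimum category the
    // parallel algorithm overloads are specified to accept.
    using iterator_category = std::forward_iterator_tag;
    using value_type        = NumericType;
    using difference_type   = std::ptrdiff_t;      // signed distance type
    using pointer           = NumericType const*;
    using reference         = NumericType;         // values are returned by copy

    counting_iter() = default;
    explicit counting_iter(NumericType n) : val_(n) {}

    counting_iter& operator++() { ++val_; return *this; }
    counting_iter operator++(int) { counting_iter tmp(*this); ++val_; return tmp; }

    bool operator==(counting_iter const& other) const { return val_ == other.val_; }
    bool operator!=(counting_iter const& other) const { return val_ != other.val_; }

    value_type operator*() const { return val_; }

private:
    NumericType val_{};
};

// Sums the half-open range [first, last) without storing it in a container.
inline long sum_range(int first, int last)
{
    return std::accumulate(counting_iter<int>(first), counting_iter<int>(last), 0L);
}
```

For example, sum_range(0, 1000) adds up 0 through 999 without materializing a vector.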
Answered by Useless on January 13, 2021
Although I can't suggest a way to avoid filling a vector, I can recommend using the std::iota() function as (perhaps) the most efficient/elegant way to fill it with incrementing integers:
std::vector<int> a(1000);
std::iota(std::begin(a), std::end(a), 0);
std::for_each(std::execution::par_unseq, std::begin(a), std::end(a), [&](int i) {
// ... do something based on i ...
});
The complexity of std::iota is exactly last - first increments and assignments, whereas the std::generate function has a complexity of last - first invocations of g() and assignments. Even if a decent compiler were to inline a simple increment lambda function for g, the iota syntax is considerably simpler, IMHO.
Answered by Adrian Mole on January 13, 2021
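Visual C++ provides a rich parallel programming environment, the Concurrency Runtime (ConcRT).
You can use OpenMP, which is an open standard that Visual C++ also supports. The loop is what Wikipedia calls embarrassingly parallel. Note that the following code does not create 1000 threads: OpenMP divides the 1000 iterations among a team of worker threads, whose size typically defaults to the number of available processors: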
#include <omp.h>
...
#pragma omp parallel for
for(int s = 0; s < 1000; s++)
{
for(int i = 0; i < s; i++)
// ... do something parallel based on i ...
}
The #pragma omp directives are ignored if the compiler option /openmp is not specified.
In fact, I don't understand the role of your vector, so I omitted it. I also don't understand the reasoning behind replacing the standard for loop with for_each over saved indexes, since a plain for loop does the job well.
Or you can use the Microsoft-specific library PPL. The following code generates indexes from 0 to 999 inclusive and passes each one to the parallel routine as the s variable. As with OpenMP, the runtime schedules these iterations across a pool of worker threads rather than creating one thread per index:
#include <ppl.h>
...
using namespace concurrency;
parallel_for(0, 1000, [&](int s)
{
for(int i = 0; i < s; i++)
// ... do something parallel based on i ...
});
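Both OpenMP and PPL hand out loop iterations to a limited set of worker threads. As a rough illustration of how 1000 iterations might be split, the sketch below computes a simple static block partition. This is an assumption for illustration only: the helper chunk_bounds is hypothetical, OpenMP's default schedule is implementation-defined, and PPL uses its own partitioners, so neither is claimed to use exactly this scheme.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>

// Returns the half-open index range [begin, end) that worker k (0-based) of
// `workers` would handle under a static block partition of n iterations.
// The first (n % workers) workers each take one extra iteration.
inline std::pair<std::size_t, std::size_t>
chunk_bounds(std::size_t n, std::size_t workers, std::size_t k)
{
    std::size_t base  = n / workers;                   // minimum chunk size
    std::size_t extra = n % workers;                   // leftover iterations
    std::size_t begin = k * base + std::min(k, extra); // start of chunk k
    std::size_t size  = base + (k < extra ? 1 : 0);    // size of chunk k
    return {begin, begin + size};
}
```

With 1000 iterations and 8 workers, each worker would get a contiguous block of 125 indexes under this scheme.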
For heavy parallel computations, C++ AMP is also available in the Concurrency Runtime. AMP runs the parallel routines on the GPU instead of the CPU.
Answered by armagedescu on January 13, 2021
You could use std::generate to create a vector {0, 1, ..., 999}:
std::vector<int> v(1000);
std::generate(v.begin(), v.end(), [n = 0] () mutable { return n++; });
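There is an overload that accepts an ExecutionPolicy, but it is not a safe drop-in here: the lambda mutates its own counter, so concurrent invocations race on n, and algorithms are allowed to copy the function object. The result of the following is therefore not guaranteed to be {0, 1, ..., 999}: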
std::vector<int> v(1000);
std::generate(std::execution::par, v.begin(), v.end(), [n = 0] () mutable { return n++; });
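A stateful generator is fragile under a parallel policy because algorithms may copy function objects and invoke them concurrently. The sketch below simulates, sequentially, one thing an implementation could do (process two chunks with separate copies of the generator) and then shows a stateless alternative in which each element derives its value from its own position. The helper names generate_in_two_chunks and fill_by_position are hypothetical; the position trick assumes contiguous storage such as std::vector, and it should remain valid if an execution policy is added to for_each, though that is untested here.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Simulates chunked processing: std::generate takes the generator by value,
// so each call below starts from a fresh copy whose counter is 0.
std::vector<int> generate_in_two_chunks(std::size_t size)
{
    std::vector<int> v(size);
    auto gen = [n = 0]() mutable { return n++; };
    auto mid = v.begin() + static_cast<std::ptrdiff_t>(size / 2);
    std::generate(v.begin(), mid, gen);  // first half:  0, 1, 2, ...
    std::generate(mid, v.end(), gen);    // second half: 0, 1, 2, ... again
    return v;
}

// Stateless alternative: the value written to each element depends only on
// that element's offset from the start of the buffer.
std::vector<int> fill_by_position(std::size_t size)
{
    std::vector<int> v(size);
    int const* base = v.data();
    std::for_each(v.begin(), v.end(), [base](int& x) {
        x = static_cast<int>(&x - base);
    });
    return v;
}
```

With six elements, the chunked version yields 0, 1, 2, 0, 1, 2, while the position-based version yields 0 through 5.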
Answered by Cory Kramer on January 13, 2021