Looking for a little advice on leveraging AsParallel() or Parallel.ForEach() to speed this up.
See the method I've got (simplified/bastardized for this example) below.
It takes a list like "US, FR, APAC", where "APAC" is an alias for maybe 50 other countries ("US, FR, JP, IT, GB", etc.). The method should take "US, FR, APAC" and convert it to a list of "US", "FR", plus all the countries that are in "APAC".
private IEnumerable<string> Countries(string[] countriesAndAliases)
{
    var countries = new List<string>();
    foreach (var countryOrAlias in countriesAndAliases)
    {
        if (IsCountryNotAlias(countryOrAlias))
        {
            countries.Add(countryOrAlias);
        }
        else
        {
            foreach (var aliasCountry in AliasCountryLists[countryOrAlias])
            {
                countries.Add(aliasCountry);
            }
        }
    }
    return countries.Distinct();
}
Is making this parallelized as simple as changing it to what's below? Is there more nuance to using AsParallel() than this? Should I be using Parallel.ForEach() instead of foreach? What rules of thumb should I use when parallelizing foreach loops?
private IEnumerable<string> Countries(string[] countriesAndAliases)
{
    var countries = new List<string>();
    foreach (var countryOrAlias in countriesAndAliases.AsParallel())
    {
        if (IsCountryNotAlias(countryOrAlias))
        {
            countries.Add(countryOrAlias);
        }
        else
        {
            foreach (var aliasCountry in AliasCountryLists[countryOrAlias].AsParallel())
            {
                countries.Add(aliasCountry);
            }
        }
    }
    return countries.Distinct();
}
Several points.
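Writing just countriesAndAliases.AsParallel() is useless. AsParallel() makes the part of the LINQ query that comes after it execute in parallel. Here that part is empty, so it is no use at all: the body of the foreach loop still runs sequentially on the calling thread.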
Generally you should replace foreach with Parallel.ForEach(). But beware of code that is not thread safe! You have some. You can't just wrap the loop body in Parallel.ForEach(), because List<T>.Add is not thread safe.
So you should do something like this (sorry, I didn't test it, but it compiles):
return countriesAndAliases
    .AsParallel()
    .SelectMany(s =>
        IsCountryNotAlias(s)
            ? Enumerable.Repeat(s, 1)
            : AliasCountryLists[s]
    ).Distinct();
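If you would rather keep the loop shape from the question, a Parallel.ForEach() version is also possible, provided the shared collection is thread safe. Below is a minimal sketch, not a tested implementation: the contents of AliasCountryLists and the body of IsCountryNotAlias are hypothetical stand-ins (the question does not show them), and ConcurrentBag<T> is swapped in for List<T>.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class CountryExpander
{
    // Hypothetical sample data; the real AliasCountryLists is not shown in the question.
    // It is only read below, never modified, so concurrent lookups are safe.
    private static readonly Dictionary<string, string[]> AliasCountryLists =
        new Dictionary<string, string[]>
        {
            { "APAC", new[] { "JP", "AU", "US" } }
        };

    // Assumed implementation: anything that is not a known alias counts as a country.
    private static bool IsCountryNotAlias(string countryOrAlias)
    {
        return !AliasCountryLists.ContainsKey(countryOrAlias);
    }

    public static IEnumerable<string> Countries(string[] countriesAndAliases)
    {
        // ConcurrentBag<T>.Add may be called from several threads at once,
        // unlike List<T>.Add.
        var countries = new ConcurrentBag<string>();

        Parallel.ForEach(countriesAndAliases, countryOrAlias =>
        {
            if (IsCountryNotAlias(countryOrAlias))
            {
                countries.Add(countryOrAlias);
            }
            else
            {
                foreach (var aliasCountry in AliasCountryLists[countryOrAlias])
                {
                    countries.Add(aliasCountry);
                }
            }
        });

        // The order of the items in the bag is not guaranteed.
        return countries.Distinct();
    }

    public static void Main()
    {
        var result = Countries(new[] { "US", "FR", "APAC" });

        // Sort only so the printed output is stable.
        Console.WriteLine(string.Join(", ", result.OrderBy(c => c, StringComparer.Ordinal)));
        // prints "AU, FR, JP, US"
    }
}
```

With an input this small, the cost of scheduling the parallel work will probably outweigh any gain, so measure before adopting either version.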
Edit:
You must be sure about two more things:
IsCountryNotAlias must be thread safe. It would be even better if it is a pure function.
Nothing modifies AliasCountryLists in the meantime, because dictionaries are not thread safe. Or use ConcurrentDictionary to be sure.
Useful links that will help you:
Parallel Programming in .NET 4 Coding Guidelines
When Should I Use Parallel.ForEach? When Should I Use PLINQ?
PS: As you can see, the new parallel features are not as obvious as they look (and feel).