Wednesday, February 14, 2018

R nested foreach %dopar% in outer loop and %do% in inner loop


I'm running the following script in R. With %do% in the outer loop instead of %dopar%, the script works fine. With %dopar% in the outer loop, however, it runs forever without throwing any error, and memory usage climbs steadily until the machine runs out of memory. I'm using 16 cores.

    library(parallel)
    library(foreach)
    library(doSNOW)
    library(dplyr)

    NumberOfCluster <- 16
    cl <- makeCluster(NumberOfCluster)
    registerDoSNOW(cl)

    foreach(i = UNSPSC_list, .packages = c('data.table', 'dplyr'), .verbose = TRUE) %dopar% {
      terms <- as.data.table(unique(gsub(" ", "", unlist(terms_list_by_UNSPSC$Terms[which(substr(terms_list_by_UNSPSC$UNSPSC,1,6) == i)]))))
      temp <- inner_join(N_of_UNSPSCs_by_Term, terms, on = 'V1')
      temp$V2 <- 1/as.numeric(temp$V2)
      temp <- temp[order(temp$V2, decreasing = TRUE),]
      names(temp) <- c('Term','Imp')
      ABNs <- unique(UNSPSCs_per_ABN[which(substr(UNSPSCs_per_ABN$UNSPSC,1,4) == substr(i,1,4)), 1])

      predictions <- as.numeric(vector())
      predictions <- foreach (j = seq(1 : nrow(train)), .combine = 'c', .packages = 'dplyr') %do% {
        descr <- names(which(!is.na(train[j,]) == TRUE))
        if(unlist(predict_all[j,1]) %in% unlist(ABNs) || !unlist(predict_all[j,1]) %in% unlist(suppliers)) {
          union_all(predictions, sum(temp$Imp[which(temp$Term %in% descr)]))
        } else {
          union_all(predictions, 0)
        }
      }
      save(predictions, file = paste("Predictions", i,".rda", sep = "_"))
    }

1 Answer


The proper way to nest foreach loops is to use the %:% operator. See the example below, which I have tested on Windows.

    library(foreach)
    library(doSNOW)

    NumberOfCluster <- 4
    cl <- makeCluster(NumberOfCluster)
    registerDoSNOW(cl)

    N <- 1e6

    system.time(foreach(i = 1:10, .combine = rbind) %:%
                  foreach(j = 1:10, .combine = c) %do% mean(rnorm(N, i, j)))

    system.time(foreach(i = 1:10, .combine = rbind) %:%
                  foreach(j = 1:10, .combine = c) %dopar% mean(rnorm(N, i, j)))

Output:

    > system.time(foreach(i = 1:10, .combine = rbind) %:%
    +               foreach(j = 1:10, .combine = c) %do% mean(rnorm(N, i, j)))
       user  system elapsed
       7.38    0.23    7.64
    > system.time(foreach(i = 1:10, .combine = rbind) %:%
    +               foreach(j = 1:10, .combine = c) %dopar% mean(rnorm(N, i, j)))
       user  system elapsed
       0.09    0.00    2.14

[Figure: CPU usage for %do% and %dopar%]
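
To make the shape of the combined result easier to see, here is a minimal sketch of the same %:% pattern on a tiny grid. It is an illustration only, not part of the tested example: it assumes only the foreach package is installed and registers the sequential backend with registerDoSEQ() so it runs without a cluster. Replacing that call with makeCluster() and registerDoSNOW(cl), as in the example, should run the same loop in parallel.

```r
library(foreach)

# Assumption for this sketch: use the sequential backend so no cluster is needed.
# With a real cluster, register it via registerDoSNOW(cl) instead.
registerDoSEQ()

# %:% merges the two foreach objects into a single stream of (i, j) tasks.
# The inner .combine = c builds one vector per value of i;
# the outer .combine = rbind stacks those vectors as rows.
res <- foreach(i = 1:3, .combine = rbind) %:%
  foreach(j = 1:4, .combine = c) %dopar% {
    i * 10 + j
  }

dim(res)    # one row per i, one column per j
res[2, 3]   # the value computed for i = 2, j = 3
```

Because %:% hands the backend all (i, j) combinations as one task stream, the workers are not limited to parallelising over the outer index alone.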
