Hi,
I am having a strange problem that I am scratching my head about. I need to run a bunch of sdmTMB models, so I decided to run them in parallel using foreach() with %dopar%. I do something along the lines of:

    library(sdmTMB)
    library(foreach)
    library(doParallel)

    cl <- makeCluster(4)
    registerDoParallel(cl)

    ss = foreach(formula = formulas, .packages = c("sdmTMB")) %dopar% {
      m = sdmTMB(
        data = mydata,
        formula = formula,
        mesh = mymesh,
        time = 'year',
        family = lognormal(),
        spatial = "off",
        spatiotemporal = "iid",
        offset = log(mydata$num_pos_sets)
      )
      saveRDS(m, "myfile.RDS")
    }

I have tested that the model runs fine outside of the %dopar% loop. When I run the full script, however, the 4 R processes start up and use heavy amounts of CPU and memory (each often greatly exceeding 100% CPU), but the models never seem to finish: a model that takes about 15 minutes when run on its own has still not finished after 12 hours. I have confirmed this both on my desktop computer and on a cluster I have access to.

I have also confirmed that if I remove registerDoParallel, so that %dopar% essentially becomes %do%, the code does slowly advance, with a single R process using >600% CPU. Note that I have not called TMB::openmp, so if I understand correctly TMB itself should not be running in parallel.

Do you understand why this might be? Is it possible that the internal parallelization of sdmTMB is not playing nicely with foreach? What is the best way to parallelize the running of these models?

Thanks,
David