Description
After running the PhyloNext command below, I got a "caught segfault" error. Is this because I don't have enough memory? I'm on a laptop with 32 GB of RAM.
nextflow run vmikk/PhyloNext -r main \
--input "$HOME/Projects/phylonext/data_fr/occurrence.parquet/" \
--dbscan true \
--iterations 100 \
--outdir "$OUTPUTPHYLONEXT" \
-profile docker
N E X T F L O W ~ version 25.04.2
NOTE: Your local project version looks outdated - a different revision is available in the remote repository [626d28cfac]
Launching `https://github.com/vmikk/PhyloNext` [amazing_maxwell] DSL2 - revision: 3d8eae09fd [main]
====================================================================
PhyloNext: GBIF phylogenetic diversity pipeline, Version 1.4.2
====================================================================
GBIF occurrence dump: $HOME/Projects/phylonext/data_fr/occurrence.parquet/
Output path: $HOME/Projects/phylonext/analysis_2
Launch directory: $HOME/Projects/phylonext/analysis_2
Working directory: $HOME/Projects/phylonext/analysis_2/work
====================================================================
executor > local (2)
[79/5df96d] occ_filter | 0 of 1
[47/c97434] record_count | 0 of 1
[- ] outl_low -
[- ] outl_high -
[- ] prep_ott_ids -
[- ] get_ott_tree -
[- ] merge_occ -
[- ] prep_biodiv -
[- ] phylodiv -
[- ] rand_filelist -
[- ] aggregate_rnds_biodiv -
[- ] div_to_csv -
[- ] plot_pd -
[- ] plot_leaflet -
[- ] derived_datasets -
ERROR ~ Error executing process > 'record_count'
Caused by:
Process `record_count` terminated with an error exit status (139)
Command executed:
10_Record_counts.R --input occurrence.parquet --phylum NA --class NA --order NA --family NA --genus NA --country null --latmin null --latmax null --lonmin null --lonmax null --minyear 1945 --maxyear null --coordprecision 0.1 --coorduncertainty 10000 --coorduncertaintyexclude 301,3036,999,9999 --basisofrecordinclude null --basisofrecordexclude FOSSIL_SPECIMEN,LIVING_SPECIMEN --excludehuman true --terrestrial Land_Buffered_025_dgr.RData --roundcoords 2 --resolution 4 --threads 4 --rcode $(which "Shapefile_filters.R") --output "Record_counts"
Command exit status:
139
Command output:
Selected genera: NA
File with GBIF specieskeys: NA
Coordinate precision threshold: 0.1
Maximum allowed coordinate uncertainty: 10000
Black-listed values of coordinate uncertainty: 301,3036,999,9999
Country codes: NA
Minimum latitude: NA
Maximum latitude: NA
Minimum longitude: NA
Maximum longitude: NA
Basis of record to include: NA
Basis of record to exclude: FOSSIL_SPECIMEN,LIVING_SPECIMEN
Minimum year of occurrence: 1945
Maximum year of occurrence: NA
List of extict species: NA
Exclusion of human records: TRUE
Round coordinates: 2
Custom polygons: NA
WGSRPD data: NA
WGSRPD regions: NA
Terrestrial data: Land_Buffered_025_dgr.RData
Country and province centroids: NA
Capitals: NA
Institutions: NA
Uraban areas: NA
Spatial resolution: 4
Coordinate rounding: 2
Number of CPU threads to use: 4
Output prefix: Record_counts
Loading R packages...
arrow 14.0.0.2
data.table 1.14.10
dplyr 1.1.4
h3 3.7.2
sf 1.0.15
Number of available CPU threads: 22
Setting number of CPU threads to: 4
Loading Parquet data
General data filteing:
..Filtering by coordinate precision
..Filtering by coordinate uncertainty
..Filtering by coordinate uncertainty black-listed values
..Filtering by basis of record (exclusion only)
..Filtering by collection date (min year)
..Excluding human records (genus Homo)
..Rounding coordinates
..Column selection and unique record counts
..Collecting data
Command error:
Country and province centroids: NA
Capitals: NA
Institutions: NA
Uraban areas: NA
Spatial resolution: 4
Coordinate rounding: 2
Number of CPU threads to use: 4
Output prefix: Record_counts
Loading R packages...
arrow 14.0.0.2
data.table 1.14.10
dplyr 1.1.4
h3 3.7.2
sf 1.0.15
Number of available CPU threads: 22
Setting number of CPU threads to: 4
Loading Parquet data
General data filteing:
..Filtering by coordinate precision
..Filtering by coordinate uncertainty
..Filtering by coordinate uncertainty black-listed values
..Filtering by basis of record (exclusion only)
..Filtering by collection date (min year)
..Excluding human records (genus Homo)
..Rounding coordinates
..Column selection and unique record counts
..Collecting data
*** caught segfault ***
address 0x705e1d617278, cause 'memory not mapped'
Traceback:
1: Table__from_ExecPlanReader(self)
2: x$read_table()
3: as_arrow_table.RecordBatchReader(reader)
4: as_arrow_table(reader)
5: as_arrow_table.arrow_dplyr_query(x)
6: as_arrow_table(x)
7: doTryCatch(return(expr), name, parentenv, handler)
8: tryCatchOne(expr, names, parentenv, handlers[[1L]])
9: tryCatchList(expr, classes, parentenv, handlers)
10: tryCatch(as_arrow_table(x), error = function(e, call = caller_env(n = 4)) { augment_io_error_msg(e, call, schema = schema())})
11: compute.arrow_dplyr_query(x)
12: collect.arrow_dplyr_query(.)
13: collect(.)
14: dsf %>% collect()
An irrecoverable exception occurred. R is aborting now ...
.command.sh: line 2: 35 Segmentation fault (core dumped) 10_Record_counts.R --input occurrence.parquet --phylum NA --class NA --order NA --family NA --genus NA --country null --latmin null --latmax null --lonmin null --lonmax null --minyear 1945 --maxyear null --coordprecision 0.1 --coorduncertainty 10000 --coorduncertaintyexclude 301,3036,999,9999 --basisofrecordinclude null --basisofrecordexclude FOSSIL_SPECIMEN,LIVING_SPECIMEN --excludehuman true --terrestrial Land_Buffered_025_dgr.RData --roundcoords 2 --resolution 4 --threads 4 --rcode $(which "Shapefile_filters.R") --output "Record_counts"
Work dir:
$HOME/Projects/phylonext/analysis_2/work/47/c9743407b72f607692a309b50a4d7c
Container:
vmikk/rarrow:1.4.0
Tip: you can replicate the issue by changing to the process work dir and entering the command `bash .command.run`
-- Check '.nextflow.log' file for details
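For context on the exit status reported above: shells conventionally report 128 + N when a command is terminated by signal N, so 139 would correspond to signal 11, which is SIGSEGV on Linux and is consistent with the "Segmentation fault (core dumped)" line. A minimal sketch of that arithmetic (the variable names are only illustrative, not part of PhyloNext or Nextflow):

```shell
# Exit status reported by Nextflow for the record_count process
status=139

# Shells conventionally report 128 + N when a command is killed by signal N
if [ "$status" -gt 128 ]; then
    signal=$((status - 128))
    echo "Process was terminated by signal $signal"
else
    echo "Process exited on its own with status $status"
fi
```

As far as I understand, a process killed for running out of memory usually ends with SIGKILL (status 137) rather than SIGSEGV, so this status alone does not tell me whether memory is the cause.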