*** caught segfault *** address 0x705e1d617278, cause 'memory not mapped' #24

@MOB-Habitat

After running the PhyloNext command below, the `record_count` process crashed with a segfault. Is this because I don't have enough memory? I'm on a laptop with 32 GB of RAM.

nextflow run vmikk/PhyloNext -r main \
--input "$HOME/Projects/phylonext/data_fr/occurrence.parquet/" \
--dbscan true  \
--iterations 100  \
--outdir "$OUTPUTPHYLONEXT" \
-profile docker

N E X T F L O W   ~  version 25.04.2

NOTE: Your local project version looks outdated - a different revision is available in the remote repository [626d28cfac]
Launching `https://github.com/vmikk/PhyloNext` [amazing_maxwell] DSL2 - revision: 3d8eae09fd [main]


====================================================================
  PhyloNext: GBIF phylogenetic diversity pipeline, Version 1.4.2
====================================================================
  GBIF occurrence dump:     $HOME/Projects/phylonext/data_fr/occurrence.parquet/
  Output path:              $HOME/Projects/phylonext/analysis_2
  Launch directory:         $HOME/Projects/phylonext/analysis_2
  Working directory:        $HOME/Projects/phylonext/analysis_2/work

====================================================================
  
  
executor >  local (2)
[79/5df96d] occ_filter            | 0 of 1
[47/c97434] record_count          | 0 of 1 ✘
[-        ] outl_low              -
[-        ] outl_high             -
[-        ] prep_ott_ids          -
[-        ] get_ott_tree          -
[-        ] merge_occ             -
[-        ] prep_biodiv           -
[-        ] phylodiv              -
[-        ] rand_filelist         -
[-        ] aggregate_rnds_biodiv -
[-        ] div_to_csv            -
[-        ] plot_pd               -
[-        ] plot_leaflet          -
[-        ] derived_datasets      -
ERROR ~ Error executing process > 'record_count'

Caused by:
  Process `record_count` terminated with an error exit status (139)


Command executed:
  
  10_Record_counts.R       --input   occurrence.parquet       --phylum  NA       --class   NA       --order   NA       --family  NA       --genus   NA       --country null       --latmin  null       --latmax  null       --lonmin  null       --lonmax  null       --minyear 1945       --maxyear null       --coordprecision          0.1       --coorduncertainty        10000       --coorduncertaintyexclude 301,3036,999,9999       --basisofrecordinclude null       --basisofrecordexclude FOSSIL_SPECIMEN,LIVING_SPECIMEN                     --excludehuman true                     --terrestrial Land_Buffered_025_dgr.RData                                   --roundcoords 2       --resolution  4       --threads     4       --rcode       $(which "Shapefile_filters.R")       --output      "Record_counts"

Command exit status:
  139

Command output:
  Selected genera: NA
File with GBIF specieskeys: NA
Coordinate precision threshold: 0.1
Maximum allowed coordinate uncertainty: 10000
Black-listed values of coordinate uncertainty: 301,3036,999,9999
Country codes: NA
Minimum latitude: NA
Maximum latitude: NA
Minimum longitude: NA
Maximum longitude: NA
Basis of record to include: NA
Basis of record to exclude: FOSSIL_SPECIMEN,LIVING_SPECIMEN
Minimum year of occurrence: 1945
Maximum year of occurrence: NA
List of extict species: NA
Exclusion of human records: TRUE
Round coordinates: 2
Custom polygons: NA
WGSRPD data: NA
WGSRPD regions: NA
Terrestrial data: Land_Buffered_025_dgr.RData
Country and province centroids: NA
Capitals: NA
Institutions: NA
Uraban areas: NA
Spatial resolution: 4
Coordinate rounding: 2
Number of CPU threads to use: 4
Output prefix: Record_counts

Loading R packages...
arrow 14.0.0.2
data.table 1.14.10
dplyr 1.1.4
h3 3.7.2
sf 1.0.15

Number of available CPU threads:  22
Setting number of CPU threads to:  4
Loading Parquet data
General data filteing:
  ..Filtering by coordinate precision
..Filtering by coordinate uncertainty
..Filtering by coordinate uncertainty black-listed values
..Filtering by basis of record (exclusion only)
..Filtering by collection date (min year)
..Excluding human records (genus Homo)
..Rounding coordinates
..Column selection and unique record counts
..Collecting data

Command error:
  Country and province centroids: NA
Capitals: NA
Institutions: NA
Uraban areas: NA
Spatial resolution: 4
Coordinate rounding: 2
Number of CPU threads to use: 4
Output prefix: Record_counts

Loading R packages...
arrow 14.0.0.2
data.table 1.14.10
dplyr 1.1.4
h3 3.7.2
sf 1.0.15

Number of available CPU threads:  22
Setting number of CPU threads to:  4
Loading Parquet data
General data filteing:
  ..Filtering by coordinate precision
..Filtering by coordinate uncertainty
..Filtering by coordinate uncertainty black-listed values
..Filtering by basis of record (exclusion only)
..Filtering by collection date (min year)
..Excluding human records (genus Homo)
..Rounding coordinates
..Column selection and unique record counts
..Collecting data

*** caught segfault ***
  address 0x705e1d617278, cause 'memory not mapped'

Traceback:
  1: Table__from_ExecPlanReader(self)
2: x$read_table()
3: as_arrow_table.RecordBatchReader(reader)
4: as_arrow_table(reader)
5: as_arrow_table.arrow_dplyr_query(x)
6: as_arrow_table(x)
7: doTryCatch(return(expr), name, parentenv, handler)
8: tryCatchOne(expr, names, parentenv, handlers[[1L]])
9: tryCatchList(expr, classes, parentenv, handlers)
10: tryCatch(as_arrow_table(x), error = function(e, call = caller_env(n = 4)) {    augment_io_error_msg(e, call, schema = schema())})
11: compute.arrow_dplyr_query(x)
12: collect.arrow_dplyr_query(.)
13: collect(.)
14: dsf %>% collect()
An irrecoverable exception occurred. R is aborting now ...
.command.sh: line 2:    35 Segmentation fault      (core dumped) 10_Record_counts.R --input occurrence.parquet --phylum NA --class NA --order NA --family NA --genus NA --country null --latmin null --latmax null --lonmin null --lonmax null --minyear 1945 --maxyear null --coordprecision 0.1 --coorduncertainty 10000 --coorduncertaintyexclude 301,3036,999,9999 --basisofrecordinclude null --basisofrecordexclude FOSSIL_SPECIMEN,LIVING_SPECIMEN --excludehuman true --terrestrial Land_Buffered_025_dgr.RData --roundcoords 2 --resolution 4 --threads 4 --rcode $(which "Shapefile_filters.R") --output "Record_counts"

Work dir:
  $HOME/Projects/phylonext/analysis_2/work/47/c9743407b72f607692a309b50a4d7c

Container:
  vmikk/rarrow:1.4.0

Tip: you can replicate the issue by changing to the process work dir and entering the command `bash .command.run`

-- Check '.nextflow.log' file for details
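
For reference, the exit status 139 reported above follows the usual shell convention of 128 plus the signal number, so 139 - 128 = 11, which is SIGSEGV on Linux. That matches the "Segmentation fault (core dumped)" line, but on its own it does not say whether running out of memory was the trigger. A minimal sketch of the arithmetic, assuming bash; the variable names are only illustrative and not part of PhyloNext:

```bash
#!/usr/bin/env bash
# Exit status reported by Nextflow for the record_count process (taken from the log).
status=139

# By shell convention, a status above 128 means "terminated by signal (status - 128)".
sig=$((status - 128))

# kill -l <number> prints the signal's name without the SIG prefix.
name=$(kill -l "$sig")

echo "exit status ${status} = 128 + ${sig} (SIG${name})"
```

The same check can be applied to the value stored in the `.exitcode` file that Nextflow writes into the process work directory.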
