Skip to content

Feedback on notebook 'Demographics on Lebanon' #99

@abrilr1604

Description

@abrilr1604

Hi! I've reviewing and running the first notebook, 'Demographics of Lebanon', here's some feedback:

  1. I get an error after running this code:
hrsl_lebanon0 = LEBANON_ADM0.sjoin(hrsl)
hrsl_lebanon0 = hrsl_lebanon0.groupby(["admin0Name"]).sum()
hrsl_lebanon3 = LEBANON_ADM3.sjoin(hrsl)

Image

The issue happens because hrsl_lebanon0 has two datetime columns:

date datetime64[ms]
validOn datetime64[ms]

But it seems that hrsl_lebanon0 isn’t used after that. Is the notebook going to be updated? If not, it might make sense to either remove or comment out that line to avoid errors for other people running the notebook.

  1. I also get the same error after running this:
hrsl_lebanon3 = (
    hrsl_lebanon3.groupby(["admin0Name", "admin1Name", "admin2Name", "admin3Name"])
    .sum()
    .reset_index()[["admin0Name", "admin1Name", "admin3Name", "population"]]
)
hrsl_lebanon3 = hrsl_lebanon3.merge(
    LEBANON_ADM3[["admin0Name", "admin1Name", "admin3Name", "geometry"]],
    on=["admin0Name", "admin1Name", "admin3Name"],
)

hrsl_lebanon3 = gpd.GeoDataFrame(hrsl_lebanon3, crs="EPSG:4326", geometry="geometry")

Image

This happens because there are non-numeric columns (like geometry and datetime) that cannot be summed.

I fixed it by excluding those data types:

hrsl_lebanon3 = (
    hrsl_lebanon3.select_dtypes(exclude=["datetime", "geometry"])  # Exclude datetime and geometry columns
    .groupby(["admin0Name", "admin1Name", "admin2Name", "admin3Name"])
    .sum()
    .reset_index()[["admin0Name", "admin1Name", "admin3Name", "population"]]
)
  1. The lebanon_bbox variable is declared but not used. We could either changed the code to:
hrsl = hrsl[
    (hrsl["longitude"] > lebanon_bbox[0])  
    & (hrsl["longitude"] < lebanon_bbox[2])  
    & (hrsl["latitude"] > lebanon_bbox[1]) 
    & (hrsl["latitude"] < lebanon_bbox[3])  
]

or just delete the variable if it’s not required.

  1. I recommend adding a semicolon ; at the end of the cells that create maps to suppress unnecessary messages from appearing above them:
    Image

  2. It would be helpful to simplify this part by hiding the urls and adding hyperlinks.

Population data: Population data was used by the [estimates released by UNOCHA in 2002](https://cod.humanitarianresponse.info/fr/search/field_country_region/138, Accessed 05/01/2014). The lowest admin level at which data was available is admin 3.

Sex variables: Sex variables were derived from UN Urban and Rural Population by Age and Sex (URPAS) 2010 estimates, generated using URPAS, http://esa.un.org/unpd/popdev/urpas/urpas2014.aspx, Accessed 03/27/2015 at a national level.

Age variables: Age variables were obtained from UN Urban and Rural Population by Age and Sex (URPAS) 2010 estimates, generated using URPAS, http://esa.un.org/unpd/popdev/urpas/urpas2014.aspx, Accessed 03/27/2015 at a national level.

Boundaries:Boundaries were obtained from UN Office for the Coordination of Humanitarian Affairs (UN OCHA), http://www.humanitarianresponse.info/applications/data, Accessed 01/05/2011.

For example change this: Population data: Population data was used by the estimates released by UNOCHA in 2002 (https://cod.humanitarianresponse.info/fr/search/field_country_region/138, Accessed 05/01/2014).

to this: Population data: Population data was used by the estimates, released by UNOCHA in 2002 and accessed 05/01/2014.

I also noticed here that the links for Population data and Boundaries both lead to the same page (https://data.humdata.org/), even though the links are different.

  1. the references could be improved by organizing them in the bibliography.bib file, which is already stored in the repository. This would change the current citation format, for example from:

The analysis was validated in 18 countries (Tiecke, et.al., 2017).

to: (where the link in [1] would lead to the end of the page where the references are listed).

The analysis was validated in 18 countries [1].

  1. I noticed a few typos such as missing periods, I could correct them.

If you agree with any of these suggestions, please let me know so I can create a PR with the changes 🙂

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions