ENH: Add a new class that provides information on each table and column #42

@linwoodc3

Description

This is part of Phase 1.

GDELT is a very complex data set, and beginners will need to understand what is available. This is a multi-pronged issue, as it is tied to #30.

The implementation is up to the coder who takes this on, but for consideration:

  • Create a class that is an "information" or "whatIs" class. The name of the class should be easy to understand and let the user know to use this specific class to learn more about tables and column names in tables.

  • Each table for GDELT (events, gkg, iatv, mentions, literature) should have a method that returns a description of the table. The GDELT Codebook descriptions may help give a generic overview of the tables.

  • Each table will need to include different descriptions for the different GDELT versions (version 1 and version 2). The main difference is that new columns or improvements should be highlighted in the description. For example, the Events 1 table has fewer columns than the Events 2 table. The description will briefly explain why (maybe 1 sentence at the beginning of the description of Events 2).

  • Each table will have a dataframe that provides a description of the columns. Each column will have a name, data type (integer, string, etc.), and a description.

  • Write a unit test for each table; start by writing failing unit tests first (to load the table), then go back and make the tables load with the descriptions. We must have a unit test for each table.

A potential tree is:
gdelt.info -> events -> columndescription

OR

gdelt.info(version=2) -> events -> tabledescription

The version should be set in gdelt.
