Description
This is part of Phase 1.
GDELT is a very complex data set, and beginners will need to understand what is available. This is a multi-pronged issue, as it is tied to #30.
The implementation is up to the coder who takes this on, but for consideration:
- Create a class that is an "information" or "whatIs" class. The name of the class should be easy to understand and should tell the user that this is the class to use to learn more about tables and the column names in those tables.
- Each table for GDELT (`events`, `gkg`, `iatv`, `mentions`, `literature`) should have a method that returns a description of the table. GDELT Codebook descriptions may help give a generic overview of tables.
  - The `csv` and `json` files here will be used.
- Each table will need to include different descriptions for the different GDELT versions (version 1 and version 2). The main difference is that new columns or improvements should be highlighted in the description. For example, the Events 1 table has fewer columns than the Events 2 table. The description will briefly explain why (maybe one sentence at the beginning of the Events 2 description).
- Each table will have a dataframe that provides a description of the columns. Each column will have a name, data type (integer, string, etc.), and a description.
- Write a unit test for each table. Start by writing failing unit tests first (to load the table), then go back and make the tables load with the descriptions. We must have a unit test for each table.
A potential tree is:
gdelt.info -> events -> columndescription
OR
gdelt.info(version=2) -> events -> tabledescription
The version should be set in gdelt.
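One possible shape for that tree, sketched under stated assumptions: `Info` and `TableInfo` are hypothetical names, the description strings are placeholders, and the version is passed in directly here, whereas in the real package it would presumably come from the `gdelt` object.

```python
from dataclasses import dataclass

TABLES = ("events", "gkg", "iatv", "mentions", "literature")


@dataclass(frozen=True)
class TableInfo:
    """Hypothetical node for one table at one GDELT version."""
    name: str
    version: int

    @property
    def tabledescription(self):
        text = (f"Placeholder description of the {self.name} table "
                f"(GDELT version {self.version}).")
        if self.version == 2:
            # Version 2 descriptions open with one sentence on what changed.
            text = "Placeholder note on what changed since version 1. " + text
        return text

    @property
    def columndescription(self):
        # Placeholder: the real implementation would return the per-column
        # dataframe (name, data type, description) for this table and version.
        return []


class Info:
    """Hypothetical entry point; one attribute per GDELT table."""

    def __init__(self, version=2):
        if version not in (1, 2):
            raise ValueError("version must be 1 or 2")
        self.version = version
        for name in TABLES:
            setattr(self, name, TableInfo(name, version))


info = Info(version=2)
print(info.events.tabledescription)
```

Setting the version once on `Info` keeps every table node consistent with it, which matches the idea that the version is chosen in one place rather than per call.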