Commit 20e7304

Changed the output to dataframes
Cleaned up the functions, added back dependencies on DataFrames, JSONTables and CSV, and `get_nass` now returns a DataFrame for JSON and CSV. Cleaned up and updated the README.md. Added compat bounds.
1 parent a26c437 commit 20e7304

6 files changed: 51 additions and 64 deletions

Project.toml

Lines changed: 7 additions & 1 deletion
@@ -4,13 +4,19 @@ authors = ["H. Alejandro Merchan", "California Department of Pesticide Regulatio
 version = "0.1.0"
 
 [deps]
+CSV = "336ed68f-0bac-5ca0-87d4-7b16caf5d00b"
+DataFrames = "a93c6f00-e57d-5684-b7b6-d8193f3e46c0"
 HTTP = "cd3eb016-35fb-5094-929b-558a96fad6f3"
 JSON3 = "0f8b85d8-7281-11e9-16c2-39a750bddbf1"
+JSONTables = "b9914132-a727-11e9-1322-f18e41205b0b"
 
 [compat]
+CSV = "0.6"
+DataFrames = "0.21"
 HTTP = "0.8"
-julia = "1"
 JSON3 = "1"
+JSONTables = "1"
+julia = "1"
 
 [extras]
 Test = "8dfed614-e22c-5e08-85e1-65c5234f0b40"

README.md

Lines changed: 27 additions & 40 deletions
@@ -8,7 +8,6 @@
 ```@julia
 add USDAQuickStats
 ```
-
 ## Index
 
 The package contains following functions:
@@ -19,22 +18,22 @@ The package contains following functions:
 - `get_nass`
 
 ## Tutorial and Workflow
-### Set up an Envionment Variable for the NASS API key
+### Set up an Environment Variable for the NASS API key
 
-To start using the API, you first need to get a **personal API key**.
+To start using the API, the user first needs to get a **personal API key**.
 
-You can request a NASS API key at [https://quickstats.nass.usda.gov/api](https://quickstats.nass.usda.gov/api).
+The user can request a NASS API key at [https://quickstats.nass.usda.gov/api](https://quickstats.nass.usda.gov/api).
 
-Once you receive your key, you can either set it up as an environment variable called USDA_QUICK_SURVEY_KEY" or set it up during a new julia session with
+The API key can be saved as an environment variable called "USDA_QUICK_SURVEY_KEY" or used during each new Julia session by setting it up using:
 
 ```@julia
 using USDAQuickStats
 set_api_key("YOUR_KEY"::String)
 ```
 
-where you manually replace `YOUR_KEY` with your private API key.
+replacing `"YOUR_KEY"` with the private API key as a string.
 
-If you are constantly using the database, you might want to make your key into a permanent variable in your environment.
+Saving the key into a permanent variable in your environment is dependent on the operating system.
 
 ### Query the database
 
@@ -46,45 +45,23 @@ The API for the Quick Stats database provides three main functions:
 
 **get_nass**
 
+`get_nass(args...; format="json")
+`
 The main function is `get_nass`, which queries the main USDA Quick Stats database.
 
-The description of the different fields for the database is available [here].(https://quickstats.nass.usda.gov/api)
+`args...` is a list of the different headers from the database that can be queried. Each argument is a string with the name of the header and the value from that header in uppercase, e.g. `"header=VALUE`. The description of the different headers (also called columns) for the database is available [here].(https://quickstats.nass.usda.gov/api)
 
-In this example I queried the survey data for oranges in California (CA) for the year 2019. I'm interested in the variables "ACRES BEARING" and "PRICE RECEIVED".
+The `format` keyword can be added to the query after a semicolon `;` and defines the format of the response. It is set to `JSON` as a default, other formats provided by the database are `CSV` and `XML`.
 
-```@julia
-query = get_nass("source_desc=SURVEY","commodity_desc=ORANGES","state_alpha=CA", "year=2019","statisticcat_desc=AREA%20BEARING","statisticcat_desc=PRICE%20RECEIVED")
-```
-output
+The function returns a DataFrame with the requested query for the `JSON` and `CSV` formats, no DataFrame has been implemented for the `XML` format yet, PR's welcome.
 
-```@julia
-JSON3.Object{Array{UInt8,1},Array{UInt64,1}} with 1 entry:
-:data => JSON3.Object[{…
-```
-
-The function produces a JSON object by default which can be saved and parsed in different ways.
+In the following example, the survey data for oranges in California (CA) for the year 2019 was queried for information about the headers "ACRES BEARING" and "PRICE RECEIVED".
 
-The `format` keyword can be added to the query after a semicolon `;` to request other formats outputs, CSV and XML are also available.
+Notice that header values that have spaces in them need to be passed with the symbol `%20` replacing the space. In general, no spaces are allowed in the query.
 
 ```@julia
-query = get_nass("source_desc=SURVEY","commodity_desc=ORANGES","state_alpha=CA", "year=2019","statisticcat_desc=AREA%20BEARING","statisticcat_desc=PRICE%20RECEIVED"; format="CSV")
-```
-
-The purpose of the package is to query the database and the user will perform any further manipulation of the resulting object.
-
-For example, to read the JSON object into a DataFrame, the user can use the following packages:
-- [DataFrames](https://github.com/JuliaData/DataFrames.jl)
-- [JSONTables](https://github.com/JuliaData/JSONTables.jl)
-- [JSON3](https://github.com/quinnj/JSON3.jl)
-
-And do something like this:
-
-```@julia
-using DataFrames, JSONTables, JSON3
 query = get_nass("source_desc=SURVEY","commodity_desc=ORANGES","state_alpha=CA", "year=2019","statisticcat_desc=AREA%20BEARING","statisticcat_desc=PRICE%20RECEIVED")
-DataFrames.DataFrame(JSONTables.jsontable(query)[:data]))
 ```
-
 output
 
 ```@julia
@@ -102,10 +79,10 @@ output
 
 **get_param_values**
 
-`get_param_values` is a helper query that allow user to check the values of a parameter in the query. This is useful when constructing different query strings.
+`get_param_values(arg)` is a helper query that allow user to check the values of a field `arg` from the database. This is useful when constructing different query strings, as it allows the user to determine which values are available on each field.
 
 ```@julia
-get_param_values("sector_desc")
+db_values = get_param_values("sector_desc")
 ```
 
 output
@@ -115,12 +92,20 @@ JSON3.Object{Array{UInt8,1},Array{UInt64,1}} with 1 entry:
 :sector_desc => ["ANIMALS & PRODUCTS", "CROPS", "DEMOGRAPHICS", "ECONOMICS", "ENVIRONMENTAL"]
 ```
 
+If the user need to access the values, they are available as an array `db_values[:sector_desc]`.
+
 **get_counts**
 
-`get_counts` is a helper query that allows user to check the number of records a query will produce before performing the query. This is important because the USDA Quick Stats API has a limit of 50,000 records per query. Any query requesting a number of records larger than this limit will fail.
+`get_counts(args...)` is a helper query that allows user to check the number of records a query using the fields in `args...` will produce before performing the query. This is important because the USDA Quick Stats API has a limit of 50,000 records per query. Any query requesting a number of records larger than this limit will fail.
+
+As in `get_nass`, `args...` is a list of the different headers from the database that can be queried. Each argument is a string with the name of the header and the value from that header in uppercase, e.g. `"header=VALUE`. The description of the different headers (also called columns) for the database is available [here].(https://quickstats.nass.usda.gov/api)
+
+In the following example, the number of records for survey data for oranges in California (CA) for the year 2019 with information about the headers "ACRES BEARING" and "PRICE RECEIVED" was queried.
+
+Notice that header values that have spaces in them need to be passed with the symbol `%20` replacing the space. In general, no spaces are allowed in the query.
 
 ```@julia
-get_counts("source_desc=SURVEY","commodity_desc=ORANGES","state_alpha=CA", "year=2019","statisticcat_desc=AREA%20BEARING","statisticcat_desc=PRICE%20RECEIVED")
+count = get_counts("source_desc=SURVEY","commodity_desc=ORANGES","state_alpha=CA", "year=2019","statisticcat_desc=AREA%20BEARING","statisticcat_desc=PRICE%20RECEIVED")
 ```
 
 output
@@ -130,6 +115,8 @@ JSON3.Object{Array{UInt8,1},Array{UInt64,1}} with 1 entry:
 :count => 276
 ```
 
+Same as before, the value can be accessed as an array `count[:count]`.
+
 A very large query would be for example:
 
 ```@julia

src/USDAQuickStats.jl

Lines changed: 6 additions & 2 deletions
@@ -1,7 +1,11 @@
 module USDAQuickStats
 
-import HTTP
-import JSON3: read
+import CSV
+import DataFrames: DataFrame
+import HTTP: request
+import JSON3
+import JSONTables: jsontable
+
 
 export
     set_api_key,

src/counts.jl

Lines changed: 1 addition & 2 deletions
@@ -11,6 +11,5 @@ function get_counts(args...)
 
     url = string(header, query)
 
-    r = HTTP.request("GET", url)
-    read(r.body)
+    JSON3.read(request("GET", url).body)
 end
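
One way the count returned by `get_counts` could be used before running a query is sketched below, assuming the 50,000-record limit stated in the README. `NASS_RECORD_LIMIT` and `check_record_limit` are hypothetical names for illustration; they are not part of this commit or of USDAQuickStats.

```julia
# Limit taken from the README text; the constant name is hypothetical.
const NASS_RECORD_LIMIT = 50_000

# Hypothetical helper: returns `n` when a query of `n` records is allowed,
# and throws otherwise so the caller can narrow the query first.
function check_record_limit(n::Integer; limit::Integer = NASS_RECORD_LIMIT)
    n <= limit || throw(ArgumentError(
        "query would return $n records, more than the limit of $limit"))
    return n
end
```

With the README example, this could be called as `check_record_limit(count[:count])` before passing the same arguments to `get_nass`.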

src/nass.jl

Lines changed: 9 additions & 10 deletions
@@ -12,14 +12,13 @@ function get_nass(args...; format="json")
         query *= arg
     end
 
-    url = string(header, query)
-
-    r = HTTP.request("GET", url)
-    read(r.body)
+    if uppercase(format) == "JSON"
+        r = request("GET", string(header, query)).body
+        DataFrame(jsontable(JSON3.read(r)[:data]))
+    elseif uppercase(format) == "CSV"
+        r = request("GET", string(header, query)).body
+        CSV.read(r)
+    else
+        r = request("GET", string(header, query))
+    end
 end
-
-# TODO
-# Implement reading functions?
-#function return_table(json_object)
-# DataFrames.DataFrame(JSONTables.jsontable(JSON3.read(json_object.body)[:data]))
-#end
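
The README rule that spaces in header values must be sent as `%20` can be sketched as a small helper. This is an illustration only: `encode_spaces` is a hypothetical name, not part of this commit, and it handles only spaces rather than full URL escaping.

```julia
# Hypothetical helper: replace each space in a "header=VALUE" argument
# with "%20", as the README asks callers to do by hand.
encode_spaces(arg::AbstractString) = replace(arg, " " => "%20")

# Build some of the arguments from the README example using plain spaces.
args = encode_spaces.([
    "commodity_desc=ORANGES",
    "statisticcat_desc=AREA BEARING",
    "statisticcat_desc=PRICE RECEIVED",
])
```

The encoded strings could then be splatted into the call, for example `get_nass(args...)`.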

src/param_values.jl

Lines changed: 1 addition & 9 deletions
@@ -2,13 +2,5 @@ function get_param_values(arg)
     key = ENV["USDA_QUICK_SURVEY_KEY"]
     url = string(usda_url, "/api/get_param_values/?key=", key, "&param=", arg)
 
-    r = HTTP.request("GET", url)
-    #r
-    read(r.body)
+    JSON3.read(request("GET", url).body)
 end
-
-# TODO
-# Implement reading functions?
-#function return_param_values(json_object, arg::Symbol)
-# json_object[arg]
-#end
