Commit c33b481

Merge branch 'v_0.2'
2 parents e41a441 + cdf0961 commit c33b481

6 files changed, +83 -47 lines changed
Project.toml

Lines changed: 0 additions & 8 deletions
@@ -4,18 +4,10 @@ authors = ["H. Alejandro Merchan", "California Department of Pesticide Regulatio
 version = "0.1.2"
 
 [deps]
-CSV = "336ed68f-0bac-5ca0-87d4-7b16caf5d00b"
-DataFrames = "a93c6f00-e57d-5684-b7b6-d8193f3e46c0"
 HTTP = "cd3eb016-35fb-5094-929b-558a96fad6f3"
-JSON3 = "0f8b85d8-7281-11e9-16c2-39a750bddbf1"
-JSONTables = "b9914132-a727-11e9-1322-f18e41205b0b"
 
 [compat]
-CSV = "0.6, 0.7"
-DataFrames = "0.21, 0.22"
 HTTP = "0.8, 0.9"
-JSON3 = "1"
-JSONTables = "1"
 julia = "1"
 
 [extras]

README.md

Lines changed: 79 additions & 19 deletions
@@ -53,9 +53,9 @@ The main function is `get_nass`, which queries the main USDA Quick Stats databas
 
 The `format` keyword can be added to the query after a semicolon `;` and defines the format of the response. It is set to `JSON` as a default, other formats provided by the database are `CSV` and `XML`.
 
-The function returns a DataFrame with the requested query for the `JSON` and `CSV` formats, no DataFrame has been implemented for the `XML` format yet, PR's welcome.
+The function returns a HTTP.request object and the user can parse it using different packages, some examples below.
 
-In the following example, the survey data for oranges in California (CA) for the year 2019 was queried for information about the headers "ACRES BEARING" and "PRICE RECEIVED".
+In the following example, the survey data for oranges in California (CA) for the year 2019 was queried for information about the headers "ACRES BEARING" and "PRICE RECEIVED". The format keyword isn't specified, so the request will return a JSON file.
 
 Notice that header values that have spaces in them need to be passed with the symbol `%20` replacing the space. In general, no spaces are allowed in the query.
 
@@ -65,18 +65,40 @@ query = get_nass("source_desc=SURVEY","commodity_desc=ORANGES","state_alpha=CA",
 output
 
 ```@julia
-276×39 DataFrame. Omitted printing of 30 columns
-│ Row │ prodn_practice_desc │ state_name │ country_name │ asd_desc │ watershed_code │ state_fips_code │ source_desc │ location_desc │ statisticcat_desc │
-│ │ String │ String │ String │ String │ String │ String │ String │ String │ String │
-├─────┼──────────────────────────┼────────────┼───────────────┼──────────┼────────────────┼─────────────────┼─────────────┼───────────────┼───────────────────┤
-│ 1 │ ALL PRODUCTION PRACTICES │ CALIFORNIA │ UNITED STATES │ │ 00000000 │ 06 │ SURVEY │ CALIFORNIA │ AREA BEARING │
-│ 2 │ ALL PRODUCTION PRACTICES │ CALIFORNIA │ UNITED STATES │ │ 00000000 │ 06 │ SURVEY │ CALIFORNIA │ PRICE RECEIVED │
+HTTP.Messages.Response:
+"""
+HTTP/1.1 200 OK
+Date: Sat, 26 Dec 2020 19:36:55 GMT
+Server: Apache/2.4.23 (Linux/SUSE)
+X-Frame-Options: SAMEORIGIN
+Content-Length: 274515
+Cache-Control: max-age=86400, private
+Connection: close
+Content-Type: application/json
+Strict-Transport-Security: max-age=31536000; includeSubDomains; preload
+
+{"data":[{"begin_code":"00","prodn_practice_desc":"ALL PRODUCTION PRACTICES","watershed_desc":"","state_fips_code":"06","commodity_desc":"ORANGES","statisticcat_desc":"AREA BEARING","Value":"147,000","watershed_code":"00000000","source_desc":"SURVEY","util_practice_desc":"ALL UTILIZATION PRACTICES","domaincat_desc":"NOT SPECIFIED","domain_desc":"TOTAL","state_alpha":"CA","week_ending":"","group_desc":"FRUIT & TREE NUTS","reference_period_desc":"YEAR","CV (%)":"","year":2019,"short_desc":"ORANGES - ACRES BEARING","country_code":"9000","load_time":"2019-08-28 15:09:57","country_name":"UNITED STATES","unit_desc":"ACRES","county_code":"","end_code":"00","sector_desc":"CROPS","state_name":"CALIFORNIA","zip_5":"","class_desc":"ALL CLASSES","county_ansi":"","asd_code":"","location_desc":"CALIFORNIA","congr_district_code":"","county_name":"","state_ansi":"06","region_desc":"","asd_desc":"","freq_desc":"ANNUAL","agg_level_desc":"STATE"},{"reference_period_desc":"MARKETING YEAR","CV (%)":"","yea
 
-│ 274 │ ALL PRODUCTION PRACTICES │ CALIFORNIA │ UNITED STATES │ │ 00000000 │ 06 │ SURVEY │ CALIFORNIA │ PRICE RECEIVED │
-│ 275 │ ALL PRODUCTION PRACTICES │ CALIFORNIA │ UNITED STATES │ │ 00000000 │ 06 │ SURVEY │ CALIFORNIA │ PRICE RECEIVED │
-│ 276 │ ALL PRODUCTION PRACTICES │ CALIFORNIA │ UNITED STATES │ │ 00000000 │ 06 │ SURVEY │ CALIFORNIA │ PRICE RECEIVED │
+274515-byte body
+"""
 ```
 
+This query object can be post-processed in different ways, depending on the format. One possible option is to return the query as a CSV file and read it into a DataFrame.
+
+```@julia
+using CSV
+using DataFrames
+
+query = get_nass("source_desc=SURVEY","commodity_desc=ORANGES","state_alpha=CA", "year=2019","statisticcat_desc=AREA%20BEARING","statisticcat_desc=PRICE%20RECEIVED"; format="csv")
+
+# Display as DataFrame
+CSV.File(query.body, DataFrame)
+
+# Or save it to disk
+CSV.write("query.csv", CSV.File(query.body))
+```
+
+
 **get_param_values**
 
 `get_param_values(arg)` is a helper query that allow user to check the values of a field `arg` from the database. This is useful when constructing different query strings, as it allows the user to determine which values are available on each field.
@@ -88,11 +110,26 @@ db_values = get_param_values("sector_desc")
 output
 
 ```@julia
-JSON3.Object{Array{UInt8,1},Array{UInt64,1}} with 1 entry:
-  :sector_desc => ["ANIMALS & PRODUCTS", "CROPS", "DEMOGRAPHICS", "ECONOMICS", "ENVIRONMENTAL"]
+HTTP.Messages.Response:
+"""
+HTTP/1.1 200 OK
+Date: Sat, 26 Dec 2020 20:40:29 GMT
+Server: Apache/2.4.23 (Linux/SUSE)
+X-Frame-Options: SAMEORIGIN
+Content-Length: 89
+Cache-Control: max-age=86400, private
+Connection: close
+Content-Type: application/json
+Strict-Transport-Security: max-age=31536000; includeSubDomains; preload
+
+{"sector_desc":["ANIMALS & PRODUCTS","CROPS","DEMOGRAPHICS","ECONOMICS","ENVIRONMENTAL"]}"""
 ```
+The query object can be post processed using the JSON3 package to obtain a more readable output if needed.
+```@julia
+using JSON3
 
-If the user need to access the values, they are available as an array `db_values[:sector_desc]`.
+JSON3.read(db_values.body)
+```
 
 **get_counts**
 
@@ -111,11 +148,22 @@ count = get_counts("source_desc=SURVEY","commodity_desc=ORANGES","state_alpha=CA
 output
 
 ```@julia
-JSON3.Object{Array{UInt8,1},Array{UInt64,1}} with 1 entry:
-  :count => 276
+HTTP.Messages.Response:
+"""
+HTTP/1.1 200 OK
+Date: Sat, 26 Dec 2020 20:47:55 GMT
+Server: Apache/2.4.23 (Linux/SUSE)
+X-Frame-Options: SAMEORIGIN
+Content-Length: 13
+Cache-Control: max-age=86400, private
+Connection: close
+Content-Type: application/json
+Strict-Transport-Security: max-age=31536000; includeSubDomains; preload
+
+{"count":276}"""
 ```
 
-Same as before, the value can be accessed as an array `count[:count]`.
+Same as before, the object can be processed with the JSON3 package to get a more readable output.
 
 A very large query would be for example:
 
@@ -126,9 +174,21 @@ get_counts("source_desc=SURVEY", "year=2019")
 output
 
 ```@julia
-JSON3.Object{Array{UInt8,1},Array{UInt64,1}} with 1 entry:
-  :count => 381929
+HTTP.Messages.Response:
+"""
+HTTP/1.1 200 OK
+Date: Sat, 26 Dec 2020 20:49:14 GMT
+Server: Apache/2.4.23 (Linux/SUSE)
+X-Frame-Options: SAMEORIGIN
+Content-Length: 16
+Cache-Control: max-age=86400, private
+Connection: close
+Content-Type: application/json
+Strict-Transport-Security: max-age=31536000; includeSubDomains; preload
+
+{"count":448878}"""
 ```
+This query would fail if ran directly using the `get_nass` function, because it exceeds the limit of 50000 rows.
 
 I would like to thank @markushhh, because I heavily used his [FredApi.jl](https://github.com/markushhh/FredApi.jl) for inspiration. And sometimes blatant plagiarism.
 
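
Since the README now leaves parsing to the caller, here is a minimal sketch of turning a response body into a DataFrame for both formats. It assumes the body is a byte vector, as HTTP.jl responses expose it. The CSV path uses `CSV.read(source, DataFrame)` rather than the `CSV.File(query.body, DataFrame)` call shown in the README, because as far as I know `CSV.File` takes only the source as a positional argument. The JSON path reuses the `JSON3` plus `jsontable` combination from the code this commit removes. The two byte vectors are small stand-ins for `query.body`, not real API output.

```julia
using CSV
using DataFrames
using JSON3
using JSONTables: jsontable

# Stand-ins for `query.body` (assumption: a Vector{UInt8}); trimmed to three fields.
csv_body = Vector{UInt8}("year,short_desc,Value\n2019,ORANGES - ACRES BEARING,\"147,000\"\n")
json_body = Vector{UInt8}("""{"data":[{"year":2019,"short_desc":"ORANGES - ACRES BEARING","Value":"147,000"}]}""")

# CSV response: parse the bytes directly into a DataFrame.
df_csv = CSV.read(csv_body, DataFrame)

# JSON response: the records live under the "data" key.
df_json = DataFrame(jsontable(JSON3.read(json_body)[:data]))
```

With a real response, `csv_body` and `json_body` would be replaced by `query.body` from a `get_nass` call made with the matching `format` keyword.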

src/USDAQuickStats.jl

Lines changed: 1 addition & 6 deletions
@@ -1,19 +1,14 @@
 module USDAQuickStats
 
-import CSV
-import DataFrames: DataFrame
 import HTTP: request
-import JSON3
-import JSONTables: jsontable
-
 
 export
 set_api_key,
 get_counts,
 get_param_values,
 get_nass
 
-const usda_url = " http://quickstats.nass.usda.gov"
+const usda_url = "http://quickstats.nass.usda.gov"
 
 include("set_api_key.jl")
 include("nass.jl")

src/counts.jl

Lines changed: 1 addition & 3 deletions
@@ -9,7 +9,5 @@ function get_counts(args...)
 query *= arg
 end
 
-url = string(header, query)
-
-JSON3.read(request("GET", url).body)
+request("GET", string(header, query))
 end
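
Because `get_counts` now returns the raw response, a caller who wants to guard against the 50000-row limit mentioned in the README has to parse the count first. The sketch below shows one way to do that with JSON3. The helper name `within_row_limit` and the constant `NASS_ROW_LIMIT` are hypothetical, and the byte vectors stand in for `get_counts(...).body`, using the two counts shown in the README.

```julia
using JSON3

# Row limit quoted in the README; the constant name is an assumption for illustration.
const NASS_ROW_LIMIT = 50_000

# Hypothetical helper: true when the count in a get_counts body fits under the limit.
function within_row_limit(body; limit=NASS_ROW_LIMIT)
    n = JSON3.read(body)[:count]
    return n <= limit
end

# Stand-ins for `get_counts(...).body`, copied from the README outputs.
small = Vector{UInt8}("{\"count\":276}")
large = Vector{UInt8}("{\"count\":448878}")

ok_small = within_row_limit(small)
ok_large = within_row_limit(large)
```

A caller could run this check before `get_nass` and narrow the query when it returns `false`.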

src/nass.jl

Lines changed: 1 addition & 9 deletions
@@ -12,13 +12,5 @@ function get_nass(args...; format="json")
 query *= arg
 end
 
-if uppercase(format) == "JSON"
-r = request("GET", string(header, query)).body
-DataFrame(jsontable(JSON3.read(r)[:data]))
-elseif uppercase(format) == "CSV"
-r = request("GET", string(header, query)).body
-CSV.File(r) |> DataFrame
-else
-r = request("GET", string(header, query))
-end
+request("GET", string(header, query))
 end
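
The README notes that values containing spaces must be passed to `get_nass` with `%20` in place of each space. A small helper can build such arguments so the escaping is not done by hand. This is only a sketch: `nass_param` is a hypothetical name, not part of the package, and it handles spaces only, so other reserved characters (for example the `&` in "FRUIT & TREE NUTS") would need fuller percent-encoding.

```julia
# Hypothetical helper: build a "field=value" argument with spaces replaced by %20.
# Only spaces are escaped; other reserved characters are passed through unchanged.
nass_param(field::AbstractString, value::AbstractString) =
    string(field, "=", replace(value, " " => "%20"))

arg1 = nass_param("statisticcat_desc", "AREA BEARING")
arg2 = nass_param("state_alpha", "CA")
```

The resulting strings have the same shape as the arguments in the README example, such as `"statisticcat_desc=AREA%20BEARING"`.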

src/param_values.jl

Lines changed: 1 addition & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,5 @@
11
function get_param_values(arg)
22
key = ENV["USDA_QUICK_SURVEY_KEY"]
3-
url = string(usda_url, "/api/get_param_values/?key=", key, "&param=", arg)
43

5-
JSON3.read(request("GET", url).body)
4+
request("GET", string(usda_url, "/api/get_param_values/?key=", key, "&param=", arg))
65
end
