Commit 5163fce

Merge branch 'main' of github.com:UCSB-Library-Research-Data-Services/bren-eds213
2 parents 4f9b569 + 4d4f931 commit 5163fce

4 files changed (+54, -7 lines)

_quarto.yml

Lines changed: 3 additions & 3 deletions
@@ -35,9 +35,9 @@ website:
  - modules/week05/hw-05-2.qmd
  - modules/week05/hw-05-3.qmd
  - modules/week05/hw-05-4.qmd
- #- modules/week06/hw-06-1.qmd
- #- modules/week06/hw-06-2.qmd
- #- modules/week06/hw-06-3.qmd
+ - modules/week06/hw-06-1.qmd
+ - modules/week06/hw-06-2.qmd
+ - modules/week06/hw-06-3.qmd
  #- modules/week07/hw-07-1.qmd
  #- modules/week07/hw-07-2.qmd
  #- modules/week07/hw-07-3.qmd

modules/week06/hw-06-1.qmd

Lines changed: 43 additions & 4 deletions
@@ -2,19 +2,58 @@
 title: "Week 6 - Little Bobby Tables"
 ---

-View this classic XKCD cartoon:
+**Please use Canvas to return the assignments: <https://ucsb.instructure.com/courses/26293/assignments/361496>**
+
+In class we discussed how to parameterize a query and then insert values for the parameter(s):
+
+```
+query_template = "SELECT ... WHERE Species = ? AND ageMethod = ?"
+species = "wolv"
+age_method = "float"
+cur.execute(query_template, [species, age_method])
+```
+
+The bare question marks in the template are placeholders. The database driver substitutes the supplied parameter values before submitting to the database, appropriately adding any quoting and character escaping as necessary.
+
+You may decide you want to use your own Python string substitution instead:
+
+```
+query_template = "SELECT ... WHERE Species = '%s' AND ageMethod = '%s'"
+species = "wolv"
+age_method = "float"
+cur.execute(query_template % (species, age_method))
+```
+
+Before you do that, recognize that this practice continues to this day to be a **major** source of security exploits. To understand why, view this classic XKCD cartoon:

 ![https://xkcd.com/327](exploits_of_a_mom_2x.png)

-For the purposes of this problem you may assume that at some point the school's system performs the query
+To interpret the above, you may assume that at some point the school's system performs the query

 ```
 SELECT *
 FROM Students
-WHERE (name = '%s' AND year = 2024);
+WHERE (name = '%s' AND ...);
 ```

-where a student's name, as input by a user of the system, is directly substituted for the `%s`. Explain exactly how Little Bobby Tables' "name" can cause a catastrophe. Also, explain why his name has two dashes (`--`) at the end.
+where a student's name, as input by a user of the system, is directly substituted for the `%s`.
+
+## Part 1
+
+Explain exactly how Little Bobby Tables' "name" can cause a catastrophe. Explain why his name has two hyphens (`--`) at the end.
+
+## Part 2
+
+Suppose instead the school system executed the query
+
+```
+SELECT *
+FROM Students WHERE name = '%s';
+```
+
+What "name" would Little Bobby Tables use to destroy things in that case?
+
+**Credit: 15 points**

 ## Bonus problem!

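
Editorial sketch (not part of the commit): the two query styles quoted in the `hw-06-1.qmd` diff above can be compared side by side. This is a minimal illustration under stated assumptions, not the course's own code: it uses Python's standard-library `sqlite3` module because it accepts the same `?` placeholders, and the table `Demo_nests`, its rows, and both helper functions are hypothetical. It only shows that a value containing an apostrophe changes the structure of a string-substituted query, while a bound parameter is treated purely as data.

```python
import sqlite3

# Hypothetical in-memory table; the name and rows are invented for illustration.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE Demo_nests (Nest_ID TEXT, Species TEXT, ageMethod TEXT)")
cur.executemany(
    "INSERT INTO Demo_nests VALUES (?, ?, ?)",
    [("n1", "wolv", "float"), ("n2", "wolv", "lay"), ("n3", "o'neil", "float")],
)


def find_nests_parameterized(species, age_method):
    """Let the driver bind the values to the ? placeholders."""
    template = "SELECT Nest_ID FROM Demo_nests WHERE Species = ? AND ageMethod = ?"
    cur.execute(template, [species, age_method])
    return cur.fetchall()


def find_nests_substituted(species, age_method):
    """Build the SQL text with Python string substitution (the risky way)."""
    template = "SELECT Nest_ID FROM Demo_nests WHERE Species = '%s' AND ageMethod = '%s'"
    cur.execute(template % (species, age_method))
    return cur.fetchall()


# With ordinary values both approaches return the same rows.
print(find_nests_parameterized("wolv", "float"))
print(find_nests_substituted("wolv", "float"))

# A value containing an apostrophe is handled correctly when bound as a parameter...
print(find_nests_parameterized("o'neil", "float"))

# ...but it terminates the quoted string early in the substituted query,
# so the database receives malformed SQL and raises an error.
try:
    find_nests_substituted("o'neil", "float")
except sqlite3.OperationalError as err:
    print("substituted query failed:", err)
```

In this sketch the failure is harmless, but the cause is the same in both cases: user input ends up inside the SQL text itself rather than being passed alongside it.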

modules/week06/hw-06-2.qmd

Lines changed: 4 additions & 0 deletions
@@ -2,6 +2,8 @@
 title: "Week 6 - Characterizing egg variation"
 ---

+**Please use Canvas to return the assignments: <https://ucsb.instructure.com/courses/26293/assignments/361506>**
+
 You read [Egg Dimensions and Neonatal Mass of Shorebirds](https://www.jstor.org/stable/1367334) by Robert E. Ricklefs and want to see how the egg data we've been using in class compares to his results. Specifically, Ricklefs reported, "Coefficients of variation were 4 to 9% for egg volume" for shorebird eggs gathered in Manitoba, Canada. What is the range of coefficients of variation in our ASDN dataset?

 The "coefficient of variation," or CV, is a unitless measure of the variation of a sample, defined as the standard deviation divided by the mean and multiplied by 100 to express as a percentage. Thus, a CV of 10% means the standard deviation is 10% of the mean value. For the purposes of this computation, we will copy Ricklefs and use as a proxy for egg volume the formula
@@ -72,6 +74,8 @@ Pluvialis dominica 19.88%
 Pluvialis squatarola 6.94%
 ```

+**Credit: 55 points**
+
 # Appendix

 It's not necessary to use `pd.read_sql` to get data into a dataframe, it's just a convenience. To do so manually (and to show you it's not that hard), imagine that your query returns three columns. Collect the row data into three separate lists, then manually create a dataframe specifying the contents as a dictionary:
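
Editorial sketch (not part of the commit): the coefficient of variation defined in the `hw-06-2.qmd` text above, standard deviation divided by mean and multiplied by 100, can be computed in a few lines. The input values below are hypothetical, the function name is invented, and the use of the sample standard deviation (`statistics.stdev`, which divides by n - 1) is an assumption; the assignment text does not say whether the sample or population form is intended.

```python
import statistics


def coefficient_of_variation(values):
    """Return the CV of `values` as a percentage: 100 * stdev / mean."""
    mean = statistics.mean(values)
    # Assumption: sample standard deviation (n - 1 denominator).
    sd = statistics.stdev(values)
    return sd / mean * 100


# Hypothetical volume-proxy values, not taken from the ASDN dataset.
hypothetical_volumes = [10.0, 12.0, 14.0]
cv = coefficient_of_variation(hypothetical_volumes)
print(f"{cv:.2f}%")  # prints 16.67%
```

Here the mean is 12 and the sample standard deviation is 2, so the CV is 2 / 12 × 100, or about 16.67%.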

modules/week06/hw-06-3.qmd

Lines changed: 4 additions & 0 deletions
@@ -2,6 +2,8 @@
 title: "Week 6 - Who were the winners?"
 ---

+**Please use Canvas to return the assignments: <https://ucsb.instructure.com/courses/26293/assignments/361508>**
+
 At the conclusion of the ASDN project the PIs decided to hand out first, second, and third prizes to the observers who measured the most eggs. Who won? Please use R and dbplyr to answer this question, and please submit your R code. Your code should print out:

 ```
@@ -22,3 +24,5 @@ egg_table <- tbl(conn, "Bird_eggs")
 and then use tidyverse grouping, summarization, joining, and other functions to compute the desired result.

 Also, take your final expression and pipe it into `show_query()`. If you used multiple R statements, did dbplyr create a temporary table, or did it manage to do everything in one query? Did it limit to the first three rows using an R expression or an SQL LIMIT clause?
+
+**Credit: 30 points**
