Commit 05e6f20

added the test_set analysis for checking the trends for pv_ids (#69)
* added the analysis for each pv_id trends
* added documentation for test_analysis
1 parent e437d0b commit 05e6f20

3 files changed: +238 -1 lines changed

images/test_analysis_output.png

237 KB

quartz_solar_forecast/dataset/README.md

Lines changed: 15 additions & 1 deletion
@@ -7,4 +7,18 @@ By analysing the metadata, available at [Hugging Face](https://huggingface.co/da
 1. Most of the data in the test set has a tilt angle of 30-34 degrees
 2. The maximum kwp is 4.0 & the minimum kwp is 2.25 in the test set.
 
-A detailed analysis of the test set can be found at quartz_solar_forecast/dataset/dataset_analysis/test_set_analysis.ipynb
+A detailed analysis of the test set can be found at quartz_solar_forecast/dataset/dataset_analysis/test_set_analysis.ipynb
+
+### `test_set_analysis_pv_id_vs_month.ipynb`
+This file uses `testset.csv`, which consists of data from 50 photovoltaic systems represented by unique `pv_id`. Each `pv_id` has 50 data points collected at times represented by `timestamp`. The dataset was analyzed to observe the distribution trends of data points during different months of the year for each PV ID.
+The following scatter plot shows the distribution of data points for each PV ID across the months of the year.
+
+![PV ID vs. Month Distribution](https://github.com/openclimatefix/Open-Source-Quartz-Solar-Forecast/blob/main/images/test_analysis_output.png?raw=true)
+
+The following observations were made from the plot:
+
+- **Distribution of Data Points**: The plot displays data points for all months across multiple PV IDs. Each dot signifies an instance of electricity generation data recorded from a PV system.
+
+- **Frequency of Data Points**: The color intensity on the scatter plot corresponds to the frequency of data points for each PV ID and month. Lighter shades represent a lower number of data points, whereas darker shades signify a higher frequency. Notably, the months of May, June, July, August, and September are marked by darker shades, indicating a higher frequency of data points compared to the rest of the year.
+
+- **Uniformity Across Months**: Data points are distributed fairly evenly across the months for each PV ID, which implies that data collection is consistent throughout the year without significant lapses.
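
The notebook added in this commit is not rendered on this page, so its actual code is unknown. As a rough illustration of the analysis the README describes (counting data points per PV ID and month, then drawing a scatter plot shaded by that count), a minimal sketch might look like the following. The column names `pv_id` and `timestamp` come from the README; the ISO 8601 timestamp format, the helper names `count_points_per_month` and `plot_counts`, and the sample rows are assumptions made for illustration only.

```python
import csv
import io
from collections import Counter
from datetime import datetime


def count_points_per_month(rows):
    """Count rows per (pv_id, month) pair.

    `rows` is any iterable of dicts with `pv_id` and `timestamp` keys.
    Timestamps are assumed to be ISO 8601 strings (an assumption, not
    something the README states).
    """
    counts = Counter()
    for row in rows:
        month = datetime.fromisoformat(row["timestamp"]).month
        counts[(row["pv_id"], month)] += 1
    return counts


def plot_counts(counts):
    """Scatter plot of PV ID vs. month, shaded by number of data points."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for plotting

    pv_ids = sorted({pv_id for pv_id, _ in counts})
    position = {pv_id: i for i, pv_id in enumerate(pv_ids)}
    keys = list(counts)
    xs = [month for _, month in keys]
    ys = [position[pv_id] for pv_id, _ in keys]
    shades = [counts[key] for key in keys]

    fig, ax = plt.subplots()
    # With the "Blues" colormap, darker points mean more data points.
    points = ax.scatter(xs, ys, c=shades, cmap="Blues")
    ax.set_xlabel("Month")
    ax.set_ylabel("PV ID")
    ax.set_xticks(range(1, 13))
    ax.set_yticks(range(len(pv_ids)))
    ax.set_yticklabels(pv_ids)
    fig.colorbar(points, label="Number of data points")
    return fig


# Tiny made-up sample standing in for testset.csv (values are illustrative only).
sample_csv = io.StringIO(
    "pv_id,timestamp\n"
    "1001,2021-05-03 12:00:00\n"
    "1001,2021-05-17 09:30:00\n"
    "1001,2021-11-02 14:00:00\n"
    "1002,2021-07-21 10:15:00\n"
)
counts = count_points_per_month(csv.DictReader(sample_csv))
```

To run the same count on the real data, the sample could be swapped for an open file handle on `testset.csv` passed through `csv.DictReader`; the file's location and exact timestamp format would need to be checked first. Summing the counts by month alone would also give a quick numeric check on whether May through September really hold more data points than the other months.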

quartz_solar_forecast/dataset/dataset_analysis/test_set_analysis_pv_id_vs_month.ipynb

Lines changed: 223 additions & 0 deletions
Large diffs are not rendered by default.

0 commit comments
