Commit ab0782c

Merge pull request #223 from networktocode/Release_v2.2.3
Release v2.2.3
2 parents 5713a77 + 228600a commit ab0782c

24 files changed, with 1764 additions and 78 deletions

CHANGELOG.md

Lines changed: 13 additions & 0 deletions
@@ -1,5 +1,18 @@
 # Changelog

+## v2.2.3 - 2023-03-21
+
+### Changed
+
+- #216 - Allow multiple Lumen maintenance windows to be parsed
+- #212 - Updated documentation: Contribution section
+- #210 - Ability to parse multiple maintenance windows from Zayo
+- #190 - Update Telstra for new notification format
+
+### Fixed
+
+- #222 - Fix e2e tests when combining data from multiple maintenances
+
 ## v2.2.2 - 2023-01-27

 ### Changed

README.md

Lines changed: 57 additions & 0 deletions
@@ -312,6 +312,63 @@ The project is following Network to Code software development guidelines and is
    - The `Provider` also supports the definition of a `_include_filter` and a `_exclude_filter` to limit the notifications that are actually processed, avoiding false positive errors for notifications that are not relevant.
 4. Update the `unit/test_e2e.py` with the new provider, providing some data to test and validate the final `Maintenances` created.
 5. **Expose the new `Provider` class** by updating the map `SUPPORTED_PROVIDERS` in `circuit_maintenance_parser/__init__.py` to officially expose the `Provider`.
+6. Run the test suite to verify that your new unit tests work as expected and do not break existing tests: `pytest --log-cli-level=DEBUG --capture=tee-sys`. Use the `-k` flag to narrow down the tests that are executed. A successful run looks similar to the following:
+
+```
+-> % pytest --log-cli-level=DEBUG --capture=tee-sys -k test_parsers
+...omitted debug logs...
+====================================================== 99 passed, 174 deselected, 17 warnings in 10.35s ======================================================
+```
+7. Run the CI tests locally with `invoke tests --local` to ensure that your changes introduce no linting or formatting issues. Aim for a pylint score of 10/10, as in the example below:
+
+```
+-> % invoke tests --local
+LOCAL - Running command black --check --diff .
+All done! ✨ 🍰 ✨
+41 files would be left unchanged.
+LOCAL - Running command flake8 .
+LOCAL - Running command find . -name "*.py" | xargs pylint
+************* Module tasks
+tasks.py:4:0: W0402: Uses of a deprecated module 'distutils.util' (deprecated-module)
+
+--------------------------------------------------------------------
+Your code has been rated at 10.00/10 (previous run: 10.00/10, +0.00)
+```
+
+### How to debug the circuit-maintenance-parser library locally
+
+1. `poetry install` updates the library and its dependencies locally.
+2. `circuit-maintenance-parser` is now built with your recent local changes.
+
+For example, if you add a logger or a debugging exception to one of the classes:
+
+```python
+class HtmlParserZayo1(Html):
+    def parse_bs(self, btags: ResultSet, data: dict):
+        """Parse B tag."""
+        raise Exception('Debugging exception')
+```
+
+After running `poetry install`:
+
+```
+-> % circuit-maintenance-parser --data-file ~/Downloads/zayo.eml --data-type email --provider-type zayo
+Provider processing failed: Failed creating Maintenance notification for Zayo.
+Details:
+- Processor CombinedProcessor from Zayo failed due to: Debugging exception
+```
+
+> Note: `invoke build` fails because the repository has no Dockerfile. This is expected, as the library runs its pytest tests without a container.
+
+```
+-> % invoke build
+Building image circuit-maintenance-parser:2.2.2-py3.8
+#1 [internal] load build definition from Dockerfile
+#1 transferring dockerfile: 2B done
+#1 DONE 0.0s
+WARNING: failed to get git remote url: fatal: No remote configured to list refs from.
+ERROR: failed to solve: rpc error: code = Unknown desc = failed to solve with frontend dockerfile.v0: failed to read dockerfile: open /var/lib/docker/tmp/buildkit-mount1243547759/Dockerfile: no such file or directory
+```

 ## Questions

circuit_maintenance_parser/parsers/lumen.py

Lines changed: 21 additions & 4 deletions
@@ -2,6 +2,7 @@
 import logging
 from typing import Dict

+from copy import deepcopy
 from dateutil import parser
 import bs4  # type: ignore
 from bs4.element import ResultSet  # type: ignore
@@ -19,10 +20,22 @@ class HtmlParserLumen1(Html):

     def parse_html(self, soup):
         """Execute parsing."""
+        maintenances = []
         data = {}
         self.parse_spans(soup.find_all("span"), data)
         self.parse_tables(soup.find_all("table"), data)
-        return [data]
+
+        # Iterates over multiple windows and duplicates other maintenance info to a new dictionary while also updating start and end times for the specific window.
+        for window in data["windows"]:
+            maintenance = deepcopy(data)
+            maintenance["start"], maintenance["end"] = window
+            del maintenance["windows"]
+            maintenances.append(maintenance)
+
+        # Deleting the key after we are finished checking for multiple windows and duplicating data.
+        del data["windows"]
+
+        return maintenances

     def parse_spans(self, spans: ResultSet, data: Dict):
         """Parse Span tag."""
@@ -56,8 +69,11 @@ def parse_spans(self, spans: ResultSet, data: Dict):
                     data["stamp"] = self.dt2ts(stamp)
                     break

-    def parse_tables(self, tables: ResultSet, data: Dict):
+    def parse_tables(self, tables: ResultSet, data: Dict):  # pylint: disable=too-many-locals
         """Parse Table tag."""
+        # Initialise multiple windows list that will be used in parse_html
+        data["windows"] = []
+
         circuits = []
         for table in tables:
             cells = table.find_all("td")
@@ -68,9 +84,10 @@ def parse_tables(self, tables: ResultSet, data: Dict):
                 for idx in range(num_columns, len(cells), num_columns):
                     if "GMT" in cells[idx].string and "GMT" in cells[idx + 1].string:
                         start = parser.parse(cells[idx].string.split(" GMT")[0])
-                        data["start"] = self.dt2ts(start)
+                        start_ts = self.dt2ts(start)
                         end = parser.parse(cells[idx + 1].string.split(" GMT")[0])
-                        data["end"] = self.dt2ts(end)
+                        end_ts = self.dt2ts(end)
+                        data["windows"].append((start_ts, end_ts))
                         break

             elif cells[0].string == "Customer Name":
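
The Lumen change above collects every `(start, end)` pair in `data["windows"]` and then emits one maintenance per pair. A minimal standalone sketch of that fan-out step follows; the helper name `expand_windows` and the sample values are hypothetical and are not part of the library.

```python
from copy import deepcopy
from typing import Any, Dict, List


def expand_windows(data: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return one maintenance dict per (start, end) window found in data."""
    maintenances = []
    for start, end in data.get("windows", []):
        # deepcopy keeps nested values (such as the circuits list) independent
        # between the generated maintenances.
        maintenance = deepcopy(data)
        maintenance["start"], maintenance["end"] = start, end
        del maintenance["windows"]
        maintenances.append(maintenance)
    return maintenances


# Illustrative values only: arbitrary epoch seconds and made-up identifiers.
parsed = {
    "maintenance_id": "12345",
    "circuits": ["C-1", "C-2"],
    "windows": [(1000, 2000), (3000, 4000)],
}
result = expand_windows(parsed)
print(result)
```

Each resulting dict carries the shared fields plus its own `start` and `end`. A notification with no windows yields an empty list in this sketch.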

circuit_maintenance_parser/parsers/telstra.py

Lines changed: 90 additions & 1 deletion
@@ -1,12 +1,13 @@
 """Telstra parser."""
 import logging
 from typing import Dict, List
-
+import re
 from dateutil import parser
 from bs4.element import ResultSet  # type: ignore

 from circuit_maintenance_parser.parser import Html, Impact, CircuitImpact, Status

+
 # pylint: disable=too-many-branches


@@ -73,3 +74,91 @@ def parse_tables(self, tables: ResultSet, data: Dict): # pylint: disable=too-ma
                             # First sentence contains 'Maintenance Details:' so we skip it
                             data["summary"] = ". ".join(sentences[1:])
             break
+
+
+class HtmlParserTelstra2(Html):
+    """Notifications Parser for Telstra notifications."""
+
+    def parse_html(self, soup):
+        """Execute parsing."""
+        data = {}
+        self.parse_tables(soup.find_all("table"), data)
+        return [data]
+
+    def add_maintenance_data(self, table: ResultSet, data: Dict):
+        """Populate data dict."""
+        for strong_element in table.find_all("strong"):
+            if not strong_element.string:
+                continue
+            strong_text = strong_element.string.strip()
+            strong_sibling = strong_element.next_sibling.next_sibling
+            if strong_text == "Reference number":
+                data["maintenance_id"] = strong_sibling.string.strip()
+            elif strong_text == "Start time":
+                text_start = strong_sibling.string
+                regex = re.search(r"\d{2}\s[a-zA-Z]{3}\s\d{4}\s\d{2}[:]\d{2}[:]\d{2}", text_start)
+                if regex is not None:
+                    start = parser.parse(regex.group())
+                    data["start"] = self.dt2ts(start)
+                else:
+                    data["start"] = "Not defined"
+            elif strong_text == "End time":
+                text_end = strong_sibling.string
+                regex = re.search(r"\d{2}\s[a-zA-Z]{3}\s\d{4}\s\d{2}[:]\d{2}[:]\d{2}", text_end)
+                if regex is not None:
+                    end = parser.parse(regex.group())
+                    data["end"] = self.dt2ts(end)
+                else:
+                    data["end"] = "is not defined"
+            elif strong_text == "Service/s under maintenance":
+                data["circuits"] = []
+                # TODO: This split is just an assumption of the multiple service, to be checked with more samples
+                impacted_circuits = strong_sibling.text.split(", ")
+                for circuit_id in impacted_circuits:
+                    data["circuits"].append(CircuitImpact(impact=Impact("OUTAGE"), circuit_id=circuit_id.strip()))
+            elif strong_text == "Maintenance details":
+                sentences: List[str] = []
+                for element in strong_element.next_elements:
+                    if element.string == "Reference number":
+                        break
+                    if element.string and element.string not in ["\n", "", "\xa0"] + sentences:
+                        sentences.append(element.string)
+                if sentences:
+                    # First sentence contains 'Maintenance Details' so we skip it
+                    data["summary"] = ". ".join(sentences[1:])
+
+    def parse_tables(self, tables: ResultSet, data: Dict):  # pylint: disable=too-many-locals
+        """Parse Table tag."""
+        for table in tables:
+            for p_element in table.find_all("p"):
+                # TODO: We should find a more consistent way to parse the status of a maintenance note
+                p_text = p_element.text.lower()
+                if "attention" in p_text:
+                    regex = re.search("[^attention ].*", p_text.strip())
+                    if regex is not None:
+                        data["account"] = regex.group()
+                    else:
+                        data["account"] = "not Found"
+            for span_element in table.find_all("span"):
+                span_text = span_element.text.lower()
+                if "planned maintenance to our network infrastructure" in span_text:
+                    data["status"] = Status("CONFIRMED")
+                elif "emergency maintenance to our network infrastructure" in span_text:
+                    data["status"] = Status("CONFIRMED")
+                elif "has been rescheduled" in span_text:
+                    data["status"] = Status("RE-SCHEDULED")
+                elif "has been completed successfully" in span_text:
+                    data["status"] = Status("COMPLETED")
+                elif (
+                    "did not proceed" in span_text
+                    or "has been withdrawn" in span_text
+                    or "has been cancelled" in span_text
+                ):
+                    data["status"] = Status("CANCELLED")
+                elif "was unsuccessful" in span_text:
+                    data["status"] = Status("CANCELLED")
+                else:
+                    continue
+                break
+            self.add_maintenance_data(table, data)
+            break
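
The two regular expressions in `HtmlParserTelstra2` behave quite differently, which the standalone sketch below tries to illustrate. It reuses both patterns verbatim from the diff, but swaps in the standard library's `datetime.strptime` for `dateutil` so that it runs without third-party packages. The helper `extract_timestamp`, the sample strings, and the UTC assumption are all hypothetical; the diff does not show how `dt2ts` handles time zones.

```python
import re
from datetime import datetime, timezone
from typing import Optional

# Same pattern as in HtmlParserTelstra2.add_maintenance_data.
TIMESTAMP_RE = re.compile(r"\d{2}\s[a-zA-Z]{3}\s\d{4}\s\d{2}[:]\d{2}[:]\d{2}")


def extract_timestamp(text: str) -> Optional[int]:
    """Return epoch seconds for the first timestamp found in text, or None."""
    match = TIMESTAMP_RE.search(text)
    if match is None:
        return None
    # Assumption: the timestamp is treated as UTC.
    parsed = datetime.strptime(match.group(), "%d %b %Y %H:%M:%S")
    return int(parsed.replace(tzinfo=timezone.utc).timestamp())


# Hypothetical cell texts, not taken from a real Telstra notification.
start_ts = extract_timestamp("25 Oct 2022 14:00:00 (UTC)")
missing_ts = extract_timestamp("to be advised")

# "[^attention ].*" is a negated character class, not a prefix match: the
# match starts at the first character that is not one of a, t, e, n, i, o
# or a space, so leading letters of the following word can be dropped.
account = re.search("[^attention ].*", "attention network team").group()
print(start_ts, missing_ts, repr(account))  # account is 'work team'
```

Returning `None` for a missing timestamp is a choice made for this sketch only; the commit stores placeholder strings instead. If the intent of the account pattern is to drop a leading "attention", something like `re.sub(r"^attention\s*", "", text)` may be closer, although that is a guess about the notification format.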

circuit_maintenance_parser/parsers/zayo.py

Lines changed: 30 additions & 35 deletions
@@ -1,6 +1,7 @@
 """Zayo parser."""
 import logging
 import re
+from copy import deepcopy
 from typing import Dict

 import bs4  # type: ignore
@@ -44,21 +45,30 @@ class HtmlParserZayo1(Html):

     def parse_html(self, soup):
         """Execute parsing."""
+        maintenances = []
         data = {}
         self.parse_bs(soup.find_all("b"), data)
         self.parse_tables(soup.find_all("table"), data)

-        if data:
-            if "status" not in data:
-                text = soup.get_text()
-                if "will be commencing momentarily" in text:
-                    data["status"] = Status("IN-PROCESS")
-                elif "has been completed" in text or "has closed" in text:
-                    data["status"] = Status("COMPLETED")
-                elif "has rescheduled" in text:
-                    data["status"] = Status("RE-SCHEDULED")
+        if not data:
+            return [{}]

-        return [data]
+        if "status" not in data:
+            text = soup.get_text()
+            if "will be commencing momentarily" in text:
+                data["status"] = Status("IN-PROCESS")
+            elif "has been completed" in text or "has closed" in text:
+                data["status"] = Status("COMPLETED")
+            elif "has rescheduled" in text:
+                data["status"] = Status("RE-SCHEDULED")
+
+        for maintenance_window in data.get("windows", []):
+            maintenance = deepcopy(data)
+            maintenance["start"], maintenance["end"] = maintenance_window
+            del maintenance["windows"]
+            maintenances.append(maintenance)
+
+        return maintenances

     def parse_bs(self, btags: ResultSet, data: dict):
         """Parse B tag."""
@@ -71,41 +81,23 @@ def parse_bs(self, btags: ResultSet, data: dict):
                         data["status"] = Status("CONFIRMED")
                     elif "has cancelled" in line.text.lower():
                         data["status"] = Status("CANCELLED")
-                # Some Zayo notifications may include multiple activity dates.
-                # For lack of a better way to handle this, we consolidate these into a single extended activity range.
-                #
-                # For example, given:
-                #
-                # 1st Activity Date
-                # 01-Nov-2021 00:01 to 01-Nov-2021 05:00 ( Mountain )
-                # 01-Nov-2021 06:01 to 01-Nov-2021 11:00 ( GMT )
-                #
-                # 2nd Activity Date
-                # 02-Nov-2021 00:01 to 02-Nov-2021 05:00 ( Mountain )
-                # 02-Nov-2021 06:01 to 02-Nov-2021 11:00 ( GMT )
-                #
-                # 3rd Activity Date
-                # 03-Nov-2021 00:01 to 03-Nov-2021 05:00 ( Mountain )
-                # 03-Nov-2021 06:01 to 03-Nov-2021 11:00 ( GMT )
-                #
-                # our end result would be (start: "01-Nov-2021 06:01", end: "03-Nov-2021 11:00")
                 elif "activity date" in line.text.lower():
                     logger.info("Found 'activity date': %s", line.text)
+
+                    if "windows" not in data:
+                        data["windows"] = []
+
                     for sibling in line.next_siblings:
                         text = sibling.text if isinstance(sibling, bs4.element.Tag) else sibling
                         logger.debug("Checking for GMT date/timestamp in sibling: %s", text)
+
                         if "( GMT )" in text:
                             window = self.clean_line(sibling).strip("( GMT )").split(" to ")
                             start = parser.parse(window.pop(0))
-                            start_ts = self.dt2ts(start)
-                            # Keep the earliest of any listed start times
-                            if "start" not in data or data["start"] > start_ts:
-                                data["start"] = start_ts
                             end = parser.parse(window.pop(0))
+                            start_ts = self.dt2ts(start)
                             end_ts = self.dt2ts(end)
-                            # Keep the latest of any listed end times
-                            if "end" not in data or data["end"] < end_ts:
-                                data["end"] = end_ts
+                            data["windows"].append((start_ts, end_ts))
                             break
                 elif line.text.lower().strip().startswith("reason for maintenance:"):
                     data["summary"] = self.clean_line(line.next_sibling)
@@ -148,13 +140,15 @@ def parse_tables(self, tables: ResultSet, data: Dict):
                     "Customer Circuit ID",
                 ],
             )
+
             if all(table_headers != expected_headers for expected_headers in expected_headers_ref):
                 logger.warning("Table headers are not as expected: %s", head_row)
                 continue

             data_rows = table.find_all("td")
             if len(data_rows) % 5 != 0:
                 raise AssertionError("Table format is not correct")
+
             number_of_circuits = int(len(data_rows) / 5)
             for idx in range(number_of_circuits):
                 data_circuit = {}
@@ -165,5 +159,6 @@ def parse_tables(self, tables: ResultSet, data: Dict):
                 elif "no expected impact" in impact.lower():
                     data_circuit["impact"] = Impact("NO-IMPACT")
                 circuits.append(CircuitImpact(**data_circuit))
+
         if circuits:
             data["circuits"] = circuits
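
The Zayo change replaces the old "earliest start, latest end" consolidation with one window per activity date. The sketch below contrasts the two behaviours on the three GMT lines quoted in the removed comment. `to_ts` is a hypothetical stand-in for the library's `dt2ts` and assumes the GMT strings are UTC; `parse_windows` is likewise a name invented for this illustration.

```python
from datetime import datetime, timezone
from typing import List, Tuple

# The GMT lines from the comment removed in this commit.
GMT_LINES = [
    "01-Nov-2021 06:01 to 01-Nov-2021 11:00 ( GMT )",
    "02-Nov-2021 06:01 to 02-Nov-2021 11:00 ( GMT )",
    "03-Nov-2021 06:01 to 03-Nov-2021 11:00 ( GMT )",
]


def to_ts(text: str) -> int:
    """Hypothetical stand-in for dt2ts: parse a timestamp as UTC epoch seconds."""
    parsed = datetime.strptime(text.strip(), "%d-%b-%Y %H:%M")
    return int(parsed.replace(tzinfo=timezone.utc).timestamp())


def parse_windows(lines: List[str]) -> List[Tuple[int, int]]:
    """Return one (start, end) tuple per GMT line."""
    windows = []
    for line in lines:
        # str.strip takes a set of characters, not a suffix. It is safe here
        # only because each line starts and ends its timestamps with digits.
        start_text, end_text = line.strip("( GMT )").split(" to ")
        windows.append((to_ts(start_text), to_ts(end_text)))
    return windows


# New behaviour: three separate windows.
windows = parse_windows(GMT_LINES)

# Old behaviour: a single range from the earliest start to the latest end.
merged = (min(start for start, _ in windows), max(end for _, end in windows))
print(windows, merged)
```

With separate windows, the gaps between activity dates are no longer reported as part of the maintenance, which the merged range could not express.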

circuit_maintenance_parser/provider.py

Lines changed: 2 additions & 1 deletion
@@ -35,7 +35,7 @@
     SubjectParserSeaborn2,
 )
 from circuit_maintenance_parser.parsers.sparkle import HtmlParserSparkle1
-from circuit_maintenance_parser.parsers.telstra import HtmlParserTelstra1
+from circuit_maintenance_parser.parsers.telstra import HtmlParserTelstra1, HtmlParserTelstra2
 from circuit_maintenance_parser.parsers.turkcell import HtmlParserTurkcell1
 from circuit_maintenance_parser.parsers.verizon import HtmlParserVerizon1
 from circuit_maintenance_parser.parsers.zayo import HtmlParserZayo1, SubjectParserZayo1
@@ -330,6 +330,7 @@ class Telstra(GenericProvider):

     _processors: List[GenericProcessor] = [
         SimpleProcessor(data_parsers=[ICal]),
+        CombinedProcessor(data_parsers=[EmailDateParser, HtmlParserTelstra2]),
         CombinedProcessor(data_parsers=[EmailDateParser, HtmlParserTelstra1]),
     ]
     _default_organizer = "gpen@team.telstra.com"

pyproject.toml

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 [tool.poetry]
 name = "circuit-maintenance-parser"
-version = "2.2.2"
+version = "2.2.3"
 description = "Python library to parse Circuit Maintenance notifications and return a structured data back"
 authors = ["Network to Code <opensource@networktocode.com>"]
 license = "Apache-2.0"
