Commit 73af687 (1 parent: a2c02d7)

implementing immutable type, + small code fixes and changes to immutable/bflw files

18 files changed: +1366, −991 lines

Cargo.lock

Lines changed: 1 addition & 10 deletions
Some generated files are not rendered by default.

Cargo.toml

Lines changed: 2 additions & 2 deletions
@@ -1,6 +1,6 @@
  [package]
  name = "betfair_data"
- version = "0.2.0"
+ version = "0.2.1"
  edition = "2021"

  [lib]
@@ -23,7 +23,7 @@ chrono = "0.4.19"
  simdutf8 = { version = "0.1", features = ["std", "aarch64_neon"] }
  rayon = "1.5"
  flate2 = "1.0"
- rayon-seq-iter = { git = "https://github.com/nwtgck/rayon-seq-iter" }
+ # rayon-seq-iter = { git = "https://github.com/nwtgck/rayon-seq-iter" }
  bzip2-rs = { git = "https://github.com/paolobarbolini/bzip2-rs", features = ["rayon", "nightly"]}

  ouroboros = "0.14.2"

README.md

Lines changed: 58 additions & 60 deletions
@@ -2,7 +2,9 @@

  Betfair Data is a very fast Betfair historical data file parsing library for python. It currently supports tar archives containing BZ2 compressed NLJSON files (the standard format provided by [Betfair's historic data portal](https://historicdata.betfair.com/#/home)).

- The library is written in Rust and uses advanced performance enhancing techniques, like in place json deserialization and decompressing Bz2 encoded data on worker threads and is ideal for parsing large quantities of historic data that could otherwise take hours or days to parse.
+ The library is written in Rust and uses advanced performance-enhancing techniques, like in-place json deserialization and decompressing Bz2/Gzip encoded data on worker threads, and is ideal for parsing large quantities of historic data that could otherwise take hours or days to parse.
+
+ This library is a work in progress and is still subject to breaking changes.

  ## Installation

@@ -38,80 +40,76 @@ for market in betfair_data.TarBz2(paths).mutable():

  print(f"Markets {market_count} Updates {update_count}")

  ```
- ## Types
- IDE's should automatically detect the types and provide checking and auto complete. See the [pyi stub file](betfair_data.pyi) for a comprehensive view of the types and method available.
-
- <br />
-
- ## Benchmarks
-
- | Betfair Data (this) | [Betfairlightweight](https://github.com/liampauling/betfair/) |
- | ---------------------|---------------------|
- | 3m 37sec | 1hour 1min 45sec |
- | ~101 markets/sec | ~6 markets/sec |
- | ~768,000 updates/sec | ~45,500 updates/sec |
-
- Benchmarks were run against 3 months of Australian racing markets comprising roughly 22,000 markets. Benchmarks were run on a M1 Macbook Pro with 32GB ram.

- These results should only be used as a rough comparison, different machines, different sports and even different months can effect the performance and overall markets/updates per second.
+ ## Loading Files

- No disrespect is intended towards betfairlightweight, which remains an amazing library and a top choice for working with the Betfair API. Every effort was made to have its benchmark below run as fast as possible, and any improvements are welcome.
-
- <br>
+ You can read in self-recorded stream files. Make sure to set cumulative_runner_tv to False for self-recorded files so that you get the correct runner and market volumes.
+ ```python
+ import betfair_data
+ import glob

- Betfair_Data benchmark show in the example above.
- <details><summary>Betfairlightweight Benchmark</summary>
+ paths = glob.glob("data/*.gz")
+ files = betfair_data.Files(paths, cumulative_runner_tv=False)
+ ```
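The cumulative_runner_tv flag can be illustrated with a pure-Python sketch (hypothetical — `runner_traded_volume` is not part of the library's API): with the flag on, per-update traded-volume figures are accumulated by the parser; with it off, the values are trusted as already-cumulative totals, which is what self-recorded stream files contain.

```python
# Hypothetical sketch of what cumulative_runner_tv selects between.
# Official archive style: per-update deltas that must be accumulated.
# Self-recorded style: values that are already running totals.
def runner_traded_volume(updates, cumulative_runner_tv=True):
    if cumulative_runner_tv:
        return sum(updates)  # accumulate per-update deltas
    return updates[-1]       # values are already cumulative totals

deltas = [10.0, 5.0, 2.5]            # official archive style
running_totals = [10.0, 15.0, 17.5]  # self-recorded style

assert runner_traded_volume(deltas, cumulative_runner_tv=True) == 17.5
assert runner_traded_volume(running_totals, cumulative_runner_tv=False) == 17.5
```

Mis-setting the flag in either direction double-counts or truncates volume, which is why the README calls it out for self-recorded files.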
+ Or you can read official Betfair Tar archives with bz2 encoded market files.

  ```python
- from typing import Sequence
-
- import unittest.mock
- import tarfile
- import bz2
- import betfairlightweight
+ import betfair_data
+ import glob

- trading = betfairlightweight.APIClient("username", "password", "appkey")
- listener = betfairlightweight.StreamListener(
-     max_latency=None, lightweight=True, update_clk=False, output_queue=None, cumulative_runner_tv=True, calculate_market_tv=True
- )
+ paths = glob.glob("data/*.tar")
+ files = betfair_data.TarBz2(paths, cumulative_runner_tv=True)
+ ```
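For reference, the layout TarBz2 expects — a tar archive whose members are bz2-compressed NLJSON market files — can be reproduced with the standard library alone; the file name and JSON payload below are illustrative, not real market data:

```python
import bz2
import io
import os
import tarfile
import tempfile

# build a tiny tar archive containing one bz2-compressed NLJSON member
tmp = tempfile.mkdtemp()
archive_path = os.path.join(tmp, "sample.tar")

market_bytes = b'{"op":"mcm","clk":"1","mc":[]}\n'
compressed = bz2.compress(market_bytes)

with tarfile.open(archive_path, "w") as tar:
    info = tarfile.TarInfo(name="1.23456789.bz2")
    info.size = len(compressed)
    tar.addfile(info, io.BytesIO(compressed))

# read it back the way a parser would: extract each member, then decompress
with tarfile.open(archive_path) as tar:
    member = tar.getmembers()[0]
    data = bz2.decompress(tar.extractfile(member).read())

assert data == market_bytes
```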

- paths = [
-     "data/2021_10_OctRacingAUPro.tar",
-     "data/2021_11_NovRacingAUPro.tar",
-     "data/2021_12_DecRacingAUPro.tar"
- ]
+ Or load the file through any other means and pass the bytes and name into the object constructors.

- def load_tar(file_paths: Sequence[str]):
-     for file_path in file_paths:
-         with tarfile.TarFile(file_path) as archive:
-             for file in archive:
-                 yield bz2.open(archive.extractfile(file))
-     return None
+ ```python
+ import glob
+
+ from betfair_data import bflw
+
+ # generator to read in files
+ def load_files(paths: str):
+     for path in glob.glob(paths, recursive=True):
+         with open(path, "rb") as file:
+             yield (path, file.read())
+
+ # iterate over the files and convert into a bflw iterator
+ for name, bs in load_files("markets/*.json"):
+     for market_books in bflw.BflwIter(name, bs):
+         for market_book in market_books:
+             # do stuff
+             pass
+ ```
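The generator pattern above can be exercised end to end without the library itself; a self-contained sketch using temporary files (names illustrative):

```python
import glob
import os
import tempfile

# same shape as the README's load_files: yield (name, bytes) pairs
def load_files(pattern: str):
    for path in glob.glob(pattern, recursive=True):
        with open(path, "rb") as file:
            yield (path, file.read())

# create two sample files, then read them back through the generator
tmp = tempfile.mkdtemp()
for i in range(2):
    with open(os.path.join(tmp, f"market_{i}.json"), "wb") as f:
        f.write(b'{"op":"mcm"}')

pairs = sorted(load_files(os.path.join(tmp, "*.json")))
assert len(pairs) == 2
assert all(bs == b'{"op":"mcm"}' for _, bs in pairs)
```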

- market_count = 0
- update_count = 0
+ ## Object Types

- for file_obj in load_tar(paths):
-     with unittest.mock.patch("builtins.open", lambda f, _: f):
-         stream = trading.streaming.create_historical_generator_stream(
-             file_path=file_obj,
-             listener=listener,
-         )
-         gen = stream.get_generator()
+ You can use different styles of objects, with pros and cons depending on your needs.

-     market_count += 1
-     for market_books in gen():
-         for market_book in market_books:
-             update_count += 1
+ Mutable objects are generally the fastest, but can be hard to use. If you find yourself calling market.copy a lot, you may find immutable faster.
+ ```python
+ # where files is loaded from a TarBz2 or Files source like above
+ mut_iter = files.mutable()
+ for market in mut_iter: # different markets per file
+     while market.update(): # update the market in place
+         pass
+ ```

-     print(f"Markets {market_count} Updates {update_count}", end='\r')
- print(f"Markets {market_count} Updates {update_count}")
+ Immutable objects are slightly slower but can be easier to use. Equivalent to calling market.copy() on every update, but faster, as only objects that change make new copies. **NOT YET FINISHED**
+ ```python
+ immut_iter = files.immutable()
+ for market_iter in immut_iter: # different files
+     for market in market_iter: # each update of a market/file
+         pass
+ ```
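The copy-on-write idea behind the immutable iterator — each new snapshot shares every child object that did not change — can be sketched in plain Python; this illustrates the concept only, not the library's implementation:

```python
from dataclasses import dataclass, replace
from typing import Tuple

@dataclass(frozen=True)
class Runner:
    id: int
    last_price_traded: float

@dataclass(frozen=True)
class Market:
    market_id: str
    runners: Tuple[Runner, ...]

def apply_price(market: Market, runner_id: int, price: float) -> Market:
    # rebuild only the runner that changed; all others are shared by reference
    runners = tuple(
        replace(r, last_price_traded=price) if r.id == runner_id else r
        for r in market.runners
    )
    return replace(market, runners=runners)

m0 = Market("1.23456789", (Runner(1, 2.0), Runner(2, 4.0)))
m1 = apply_price(m0, 1, 2.2)

assert m0.runners[0].last_price_traded == 2.0   # old snapshot is untouched
assert m1.runners[0].last_price_traded == 2.2   # new snapshot sees the update
assert m1.runners[1] is m0.runners[1]           # unchanged runner is shared
```

This is why immutable iteration can beat calling market.copy() on every update: a full copy duplicates every runner, while structural sharing only allocates for what changed.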

+ Betfairlightweight compatible version, a drop-in replacement for bflw objects.
+ ```python
+ bflw_iter = files.bflw()
+ for file in bflw_iter: # different files
+     for market_books in file: # different books per update
+         for market in market_books: # each update of a market
+             pass
  ```
- </details>

- <br>
- <br>
+ ## Types
+ IDEs should automatically detect the types and provide checking and auto-complete. See the [pyi stub file](betfair_data.pyi) for a comprehensive view of the types and methods available.


  ## Logging

betfair_data/betfair_data.abi3.so

-16 Bytes
Binary file not shown.

pyproject.toml

Lines changed: 1 addition & 1 deletion
@@ -5,7 +5,7 @@ build-backend = "maturin"
  [project]
  name = "betfair_data"
  requires-python = ">=3.6"
- version = "0.2.0"
+ version = "0.2.1"
  description = "Fast Python Betfair historic data file parser"
  authors = [ {name = "Robert Tarbath", email = "rtarbath@gmail.com"} ]
  license = {file = "LICENSE"}

src/bflw/mod.rs

Lines changed: 0 additions & 1 deletion
@@ -5,7 +5,6 @@ pub mod market_definition;
  pub mod market_definition_runner;
  pub mod runner_book;
  mod float_str;
- mod runner_book_ex;
  mod runner_book_sp;

src/bflw/runner_book.rs

Lines changed: 1 addition & 1 deletion
@@ -6,7 +6,7 @@ use serde::{Deserialize, Deserializer};
  use serde_json::value::RawValue;

  use super::market_definition_runner::MarketDefRunnerUpdate;
- use super::runner_book_ex::{RunnerBookEX, RunnerBookEXUpdate};
+ use crate::immutable::runner_book_ex::{RunnerBookEX, RunnerBookEXUpdate};
  use super::runner_book_sp::{RunnerBookSP, RunnerBookSPUpdate};
  use crate::bflw::float_str::FloatStr;
  use crate::bflw::RoundToCents;

src/bflw/runner_book_ex.rs

Lines changed: 0 additions & 56 deletions
This file was deleted.

src/immutable/datetime.rs

Lines changed: 1 addition & 1 deletion
@@ -80,7 +80,7 @@ impl PartialEq<DateTimeString> for &str {
  }


- #[derive(Clone, Copy, Debug)]
+ #[derive(Clone, Copy, Debug, Default)]
  pub struct DateTime(u64);

  impl DateTime {

0 commit comments