Added full Markov-model pipeline #11

Merged 5 commits on Jun 27, 2025

PhilippSchmelter
Collaborator

Closes #10

This pull request adds a Markov-model pipeline for analyzing time-series data, covering preprocessing, transition-matrix generation, and tests. The main changes are the pipeline itself, bucket-based time mapping, and updates to the loader and its tests to integrate the new functionality.

Markov-model pipeline implementation:

  • Added the full Markov-model pipeline, including generation of the transition-count and transition-probability matrices (recorded in CHANGELOG.md).
  • Created src/markov/transition_counts.py for generating raw transition counts and src/markov/transitions.py for generating Laplace-smoothed transition probabilities.
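
The two steps named above (raw transition counts, then Laplace-smoothed probabilities) can be sketched as follows. This is a minimal illustration and not the code from this pull request: the function names, the integer state encoding, and the smoothing parameter `alpha` are assumptions.

```python
from typing import List, Sequence


def transition_counts(states: Sequence[int], n_states: int) -> List[List[int]]:
    """Count how often state i is directly followed by state j (hypothetical helper)."""
    counts = [[0] * n_states for _ in range(n_states)]
    # Pair each state with its successor in the sequence.
    for src, dst in zip(states, states[1:]):
        counts[src][dst] += 1
    return counts


def laplace_probabilities(counts: List[List[int]], alpha: float = 1.0) -> List[List[float]]:
    """Turn counts into row-stochastic probabilities with additive (Laplace) smoothing."""
    n_states = len(counts)
    probs = []
    for row in counts:
        # Adding alpha to every cell keeps unseen transitions at a non-zero probability.
        total = sum(row) + alpha * n_states
        probs.append([(c + alpha) / total for c in row])
    return probs


counts = transition_counts([0, 1, 1, 2, 0, 1], n_states=3)
probs = laplace_probabilities(counts)
```

With `alpha = 1.0`, every row of `probs` sums to 1 and no entry is zero, even for transitions that never occur in the input sequence.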

Time-bucket mapping:

  • Implemented bucket-based time mapping in src/markov/buckets.py (NUM_BUCKETS constant and bucket_id function), which maps each timestamp to a bucket ID determined by its month, weekend flag, and quarter-hour interval.
  • Integrated bucket assignment into the preprocessing pipeline by modifying the load_timeseries function to add a "bucket" column.
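
One way such a mapping could be built is shown below. The names `NUM_BUCKETS` and `bucket_id` come from the description above, but the signature, the ordering of the three components, and the resulting value of `NUM_BUCKETS` (12 months × 2 day types × 96 quarter-hours) are assumptions made for this sketch; the actual module may encode buckets differently.

```python
from datetime import datetime

SLOTS_PER_DAY = 24 * 4                 # quarter-hour intervals per day
NUM_BUCKETS = 12 * 2 * SLOTS_PER_DAY   # assumed layout: month x weekend flag x slot


def bucket_id(ts: datetime) -> int:
    """Map a timestamp to a bucket index in [0, NUM_BUCKETS) (assumed encoding)."""
    month = ts.month - 1                     # 0..11
    weekend = 1 if ts.weekday() >= 5 else 0  # Saturday=5, Sunday=6
    slot = ts.hour * 4 + ts.minute // 15     # 0..95
    return (month * 2 + weekend) * SLOTS_PER_DAY + slot


first = bucket_id(datetime(2025, 1, 1, 0, 0))     # a Wednesday in January
last = bucket_id(datetime(2025, 12, 28, 23, 59))  # a Sunday in December
```

Under this encoding, all timestamps that share a month, a weekday/weekend type, and a quarter-hour slot fall into the same bucket, so `first` is 0 and `last` is `NUM_BUCKETS - 1`.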

Refactoring and modularization:

  • Moved the Markov-related functionality into a dedicated src/markov package with an __init__.py file, for better organization and reuse.
  • Added _core.py to hold shared Markov-model logic, such as the _transition_counts function.

Testing and validation:

  • Updated tests/test_loader.py to validate the integration of the Markov-model pipeline, checking that the loader adds the "state" and "bucket" columns and computes the expected values.
  • Extended test coverage to validate the new columns introduced by the preprocessing pipeline, such as "scaled", "state", and "bucket".
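
A column check of this kind might look like the sketch below. It does not reproduce tests/test_loader.py: `load_rows_stub` is a hypothetical stand-in for the real loader, and its min-max scaling, equal-width state binning, and simplified quarter-hour bucket are all assumptions chosen to keep the example self-contained.

```python
from datetime import datetime
from typing import Dict, List, Tuple


def load_rows_stub(raw: List[Tuple[datetime, float]], n_states: int = 3) -> List[Dict]:
    """Hypothetical stand-in loader that adds "scaled", "state", and "bucket"."""
    values = [v for _, v in raw]
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero for constant input
    rows = []
    for ts, v in raw:
        scaled = (v - lo) / span                           # assumed min-max scaling
        state = min(int(scaled * n_states), n_states - 1)  # assumed equal-width bins
        bucket = ts.hour * 4 + ts.minute // 15             # simplified: quarter-hour only
        rows.append({"timestamp": ts, "value": v,
                     "scaled": scaled, "state": state, "bucket": bucket})
    return rows


def test_loader_adds_expected_columns() -> None:
    raw = [
        (datetime(2025, 6, 2, 0, 0), 10.0),
        (datetime(2025, 6, 2, 0, 15), 20.0),
        (datetime(2025, 6, 2, 0, 30), 30.0),
    ]
    rows = load_rows_stub(raw)
    # Every row must carry the three derived columns.
    for row in rows:
        assert {"scaled", "state", "bucket"} <= row.keys()
    # Expected values for this small, hand-checkable input.
    assert [r["scaled"] for r in rows] == [0.0, 0.5, 1.0]
    assert [r["state"] for r in rows] == [0, 1, 2]
    assert [r["bucket"] for r in rows] == [0, 1, 2]
```

Using a tiny input whose expected values can be worked out by hand keeps the test independent of the pipeline's own arithmetic.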

Miscellaneous:

  • Removed unused code and cleaned up src/main.py and src/markov/model.py.

PhilippSchmelter self-assigned this Jun 3, 2025
PhilippSchmelter added the enhancement label (New feature or request) Jun 3, 2025
PhilippSchmelter merged commit 4b79f06 into main Jun 27, 2025
2 checks passed
PhilippSchmelter deleted the ps/#10-fullMarkov branch June 27, 2025 23:14
Linked issue: Implementation of full Markov-model