mutalyzer-algebra

This library enables efficient handling of genomic variants based on all minimal LCS alignments.
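For intuition about the LCS (longest common subsequence) mentioned above, here is a minimal sketch of the textbook dynamic-programming computation of its length. It is not the library's implementation: `lcs_length` and `indel_distance` are hypothetical helper names used only for illustration, and the library works with all minimal LCS alignments rather than only their length.

```python
def lcs_length(reference: str, observed: str) -> int:
    """Length of a longest common subsequence (textbook DP, two rows)."""
    previous = [0] * (len(observed) + 1)
    for ref_symbol in reference:
        current = [0]
        for col, obs_symbol in enumerate(observed, start=1):
            if ref_symbol == obs_symbol:
                # Extend the common subsequence ending just before both symbols.
                current.append(previous[col - 1] + 1)
            else:
                # Skip a symbol from either the reference or the observed sequence.
                current.append(max(previous[col], current[col - 1]))
        previous = current
    return previous[-1]


def indel_distance(reference: str, observed: str) -> int:
    """Number of single-symbol deletions plus insertions left after an LCS alignment."""
    return len(reference) + len(observed) - 2 * lcs_length(reference, observed)


print(lcs_length("ACCTGACT", "ATCTTACTT"))      # 6
print(indel_distance("ACCTGACT", "ATCTTACTT"))  # 5
```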

Publications

The concept of binary relations between variants is introduced in:

Jonathan K. Vis, Mark A. Santcroos, Walter A. Kosters and Jeroen F.J. Laros. "A Boolean Algebra for Genetic Variants." Bioinformatics 39.1 (2023).

The process of selecting a proper variant description (variant extraction) is presented in:

Mark A. Santcroos, Walter A. Kosters, Mihai Lefter, Jeroen F.J. Laros, Jonathan K. Vis. "A Graph-based Approach to Variant Extraction." arXiv (2025).

Installation

Use pip to install from the Python Package Index (PyPI).

python -m pip install mutalyzer-algebra

Or install from a clone of the GitHub repository for development (run this inside the clone, with a virtual environment active).

python -m pip install --upgrade --editable .[dev]

Testing

Run the tests.

python -m coverage run

Usage

Use the command-line interface.

algebra --reference "AAAAA" compare --lhs-hgvs "1_2insTA" --rhs-hgvs "2_3insT"

Or as a Python package.

from algebra import compare
from algebra.variants import parse_hgvs


reference = "AAAAA"
lhs = parse_hgvs("1_2insTA")
rhs = parse_hgvs("2_3insT")

# returns: Relation.DISJOINT
compare(reference, lhs, rhs)


reference = "CATATATC"
lhs = parse_hgvs("2_7AT[4]")  # observed: CATATATATC
rhs = parse_hgvs("5_6insT")   # observed: CATATTATC

# returns: Relation.CONTAINS
compare(reference, lhs, rhs)
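To make the insertion notation in these examples concrete, the sketch below applies a simple insertion to a reference string using only the standard library. `apply_insertion` is a hypothetical helper written for this illustration and is not part of the package; it assumes 1-based positions, with the inserted sequence placed between `position` and `position + 1`.

```python
def apply_insertion(reference: str, position: int, inserted: str) -> str:
    """Insert `inserted` between 1-based positions `position` and `position + 1`."""
    if not 1 <= position < len(reference):
        raise ValueError("an insertion needs a flanking position on both sides")
    return reference[:position] + inserted + reference[position:]


# Observed sequences for the first comparison above.
print(apply_insertion("AAAAA", 1, "TA"))    # 1_2insTA -> ATAAAAA
print(apply_insertion("AAAAA", 2, "T"))     # 2_3insT  -> AATAAA

# Observed sequence for the right-hand side of the second comparison.
print(apply_insertion("CATATATC", 5, "T"))  # 5_6insT  -> CATATTATC
```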

Extracting variants from sequences.

from algebra.extractor import extract_sequence, to_hgvs


reference = "CATATATC"
observed = "CATATATATC"

canonical, _ = extract_sequence(reference, observed)
# returns: 2_7AT[4]
to_hgvs(canonical, reference)

Variant normalization.

from algebra.extractor import extract, to_hgvs
from algebra.variants import parse_hgvs


reference = "CATATATC"
variant = parse_hgvs("6_7dupAT")

canonical, _ = extract(reference, variant)
# returns: 2_7AT[4]
to_hgvs(canonical, reference)
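Normalization is useful because different descriptions can denote the same observed sequence. The following sketch, using hypothetical helpers that are not part of the package, shows that `6_7dupAT` and `2_7AT[4]` yield the same sequence for this reference. Positions are assumed to be 1-based and inclusive.

```python
def apply_duplication(reference: str, start: int, end: int) -> str:
    """Duplicate the 1-based inclusive range start..end directly after itself."""
    return reference[:end] + reference[start - 1:end] + reference[end:]


def apply_repeat(reference: str, start: int, end: int, unit: str, count: int) -> str:
    """Replace the 1-based inclusive range start..end with `count` copies of `unit`."""
    return reference[:start - 1] + unit * count + reference[end:]


reference = "CATATATC"
from_dup = apply_duplication(reference, 6, 7)         # 6_7dupAT
from_repeat = apply_repeat(reference, 2, 7, "AT", 4)  # 2_7AT[4]

print(from_dup)                 # CATATATATC
print(from_dup == from_repeat)  # True
```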

Graph-based variant representation.

from algebra import LCSgraph
from algebra.utils import to_dot


reference = "ACCTGACT"
observed = "ATCTTACTT"

graph = LCSgraph.from_sequence(reference, observed)
# visualized using Graphviz
"\n".join(to_dot(reference, graph))

Example LCS-graph
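The joined string from the example above is DOT source text. One possible way to turn it into an image is sketched below with the standard library only; `write_dot` and `render_svg` are hypothetical helpers, and the placeholder graph stands in for the real `to_dot` output. Rendering assumes the Graphviz `dot` executable is on the `PATH` and is skipped otherwise.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def write_dot(dot_source: str, path: Path) -> Path:
    """Write DOT source text to `path` and return the path."""
    path.write_text(dot_source + "\n", encoding="utf-8")
    return path


def render_svg(dot_path: Path):
    """Render a DOT file to SVG next to it; return None if Graphviz is missing."""
    if shutil.which("dot") is None:
        return None
    svg_path = dot_path.with_suffix(".svg")
    subprocess.run(["dot", "-Tsvg", str(dot_path), "-o", str(svg_path)], check=True)
    return svg_path


# Placeholder DOT text; in practice this would be "\n".join(to_dot(reference, graph)).
placeholder = 'digraph {\n    s0 -> s1 [label="example"];\n}'

with tempfile.TemporaryDirectory() as tmp:
    dot_path = write_dot(placeholder, Path(tmp) / "lcs_graph.dot")
    written = dot_path.read_text(encoding="utf-8")
    svg_path = render_svg(dot_path)  # None when Graphviz is not installed
```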

See Also

A web interface integrated with Mutalyzer: Mutalyzer Algebra

Mutalyzer Algebra on PyPI