yehonatanke/genomic_analysis

Genomic Analysis

A toolkit for analyzing DNA sequences: sequence characteristics, GC content, and motif structures.

Aspects of DNA sequence analysis:

  • GC Content Analysis: Measures the fraction of bases in a sequence that are guanine (G) or cytosine (C).
  • Motif Identification: Finds short recurring sequence patterns (motifs), which helps in understanding gene regulation and protein-DNA interactions.
  • k-mer Frequency Analysis: Counts k-mers, the subsequences of length k contained within a biological sequence.
  • Sequence Clustering: Groups DNA sequences by similarity, using GC content and sequence length as the clustering features.
  • Variation Analysis
  • Structural Variation Detection: Identifies larger rearrangements.
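As a rough illustration of the first and third items above, GC content and k-mer counting can each be written in a few lines of standard-library Python. This is a minimal sketch, not the repository's actual code; the function names `gc_content` and `kmer_counts` are hypothetical and chosen only for this example.

```python
from collections import Counter


def gc_content(seq: str) -> float:
    """Return the fraction of bases in seq that are G or C (0.0 if seq is empty)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def kmer_counts(seq: str, k: int) -> Counter:
    """Count every overlapping subsequence of length k in seq."""
    if k <= 0:
        raise ValueError("k must be positive")
    seq = seq.upper()
    # A sequence shorter than k yields an empty range, hence an empty Counter.
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


if __name__ == "__main__":
    print(gc_content("ATGCGC"))      # 4 of the 6 bases are G or C
    print(kmer_counts("ATATA", 2))   # overlapping 2-mers: AT, TA, AT, TA
```

Counting overlapping windows (a step of one base) is the usual convention for k-mer frequencies; a different step size would be a deliberate design change.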

Core Functionalities

Sequence Analysis

  • DNA sequence analysis and manipulation
  • GC content analysis
  • Motif identification and analysis
  • k-mer frequency analysis
  • Sequence clustering
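Sequence clustering on GC content and sequence length could look roughly like the sketch below. The README does not say which clustering algorithm is used, so the plain k-means here is an assumption, as are the names `clustering_features` and `kmeans` and the choice to scale length by the longest sequence so that both features lie in the range 0 to 1.

```python
def gc_content(seq: str) -> float:
    """Fraction of bases that are G or C (0.0 for an empty sequence)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def clustering_features(seqs):
    """Map each sequence to (GC fraction, length scaled by the longest sequence)."""
    max_len = max(len(s) for s in seqs)
    return [(gc_content(s), len(s) / max_len) for s in seqs]


def _dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20):
    """Plain k-means. Assumption: the first k points seed the centroids,
    which keeps the result deterministic but can give poor clusters if
    those points happen to be close together."""
    centroids = list(points[:k])
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [min(range(k), key=lambda c: _dist2(p, centroids[c]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels


if __name__ == "__main__":
    seqs = ["GCGCGCGC", "ATATATATATATATAT", "GGCCGGCC", "ATTATAATTTAAATAT"]
    print(kmeans(clustering_features(seqs), k=2))
```

Without the length scaling, raw sequence length would dominate the distance and GC content would barely affect the grouping.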

Machine Learning Applications

  • Sequence Classification

    • Feature extraction from DNA sequences
    • GC/AT content analysis
    • k-mer frequency analysis
    • Random Forest-based classification
  • Motif Prediction

    • Sliding window approach
    • Feature extraction from sequence windows
    • Position-based motif prediction
    • Random Forest-based prediction
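The feature-extraction steps listed above can be sketched as follows: a whole-sequence feature vector (GC fraction, AT fraction, k-mer frequencies) for classification, and the same vector computed per sliding window, with the window's relative position appended, for motif prediction. This is an illustration under stated assumptions rather than the repository's implementation; `sequence_features`, `sliding_windows`, and `window_feature_rows` are hypothetical names, and the exact feature layout is a guess.

```python
from collections import Counter
from itertools import product


def sequence_features(seq: str, k: int = 2) -> list:
    """Return [GC fraction, AT fraction, frequency of each of the 4**k k-mers].

    k-mers are listed in a fixed order over the alphabet ACGT so that every
    sequence produces a vector with the same layout.
    """
    seq = seq.upper()
    n = len(seq)
    gc = (seq.count("G") + seq.count("C")) / n if n else 0.0
    at = (seq.count("A") + seq.count("T")) / n if n else 0.0
    n_windows = max(n - k + 1, 0)
    counts = Counter(seq[i:i + k] for i in range(n_windows))
    all_kmers = ("".join(p) for p in product("ACGT", repeat=k))
    freqs = [counts[m] / n_windows if n_windows else 0.0 for m in all_kmers]
    return [gc, at] + freqs


def sliding_windows(seq: str, size: int, step: int = 1):
    """Yield (start, window) for each full-length window over seq."""
    if size <= 0 or step <= 0:
        raise ValueError("size and step must be positive")
    for start in range(0, len(seq) - size + 1, step):
        yield start, seq[start:start + size]


def window_feature_rows(seq: str, size: int, step: int = 1, k: int = 2):
    """Return (start, features) per window; the last feature is the window's
    relative position in the sequence (an assumed encoding of position)."""
    return [(start, sequence_features(window, k) + [start / len(seq)])
            for start, window in sliding_windows(seq, size, step)]


if __name__ == "__main__":
    print(len(sequence_features("ACGTACGT", k=2)))   # 2 + 4**2 features
    for start, feats in window_feature_rows("GGATCC", size=4, k=1):
        print(start, feats)
```

Feature rows like these, paired with class labels or per-window motif labels, could then be passed to a Random Forest model, for example scikit-learn's `RandomForestClassifier` via `fit(X, y)` and `predict(X)`. Whether the repository uses scikit-learn is not stated in the README.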

About

Computational toolkit for DNA sequence analysis.
