
Log classification using a hybrid classification framework that combines Regular Expressions (Regex), Sentence Transformers, and Large Language Models (LLMs).


Pranjal360Agarwal/LogEngineX


LogEngineX

This project implements a hybrid log classification system that combines three complementary approaches to handle varying levels of complexity in log patterns. Together, the methods cover predictable patterns, complex patterns with ample training data, and complex patterns with little or poorly labeled data.


Classification Approaches

  1. Regular Expression (Regex):

    • Handles the simplest, most predictable patterns.
    • Useful for patterns that are easily captured using predefined rules.
  2. Sentence Transformer + Logistic Regression:

    • Manages complex patterns when there is sufficient training data.
    • Utilizes embeddings generated by Sentence Transformers and applies Logistic Regression as the classification layer.
  3. LLM (Large Language Models):

    • Handles complex patterns when sufficient labeled training data is not available.
    • Provides a fallback or complementary approach to the other methods.
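The three approaches above can be read as a cascade. The following is a minimal sketch of that idea, not the project's actual code: the regex patterns, the label names, the `low_data_sources` routing rule, and the "Unclassified" fallback are all assumptions made for illustration. The embedding-based and LLM classifiers are passed in as plain callables so the sketch runs without any model installed.

```python
import re
from typing import Callable, Iterable, Optional

# Hypothetical patterns and labels, chosen only for illustration.
REGEX_RULES = [
    (re.compile(r"User \w+ logged (in|out)", re.IGNORECASE), "User Action"),
    (re.compile(r"Backup (started|completed|ended)", re.IGNORECASE), "System Notification"),
]

# A classifier takes a log message and returns a label, or None if it cannot decide.
Classifier = Callable[[str], Optional[str]]


def classify_with_regex(message: str) -> Optional[str]:
    """Return the label of the first matching rule, or None if no rule matches."""
    for pattern, label in REGEX_RULES:
        if pattern.search(message):
            return label
    return None


def classify_log(
    source: str,
    message: str,
    embedding_classifier: Classifier,
    llm_classifier: Classifier,
    low_data_sources: Iterable[str] = (),
) -> str:
    """Try regex first, then route to the embedding model or the LLM.

    Assumption: sources listed in low_data_sources lack enough labeled
    training data, so their messages go to the LLM instead of the
    Sentence Transformer + Logistic Regression classifier.
    """
    label = classify_with_regex(message)
    if label is not None:
        return label

    if source in set(low_data_sources):
        label = llm_classifier(message)
    else:
        label = embedding_classifier(message)

    # Assumed fallback label when no stage produces a result.
    return label if label is not None else "Unclassified"
```

In a real setup, `embedding_classifier` would wrap the saved Sentence Transformer and Logistic Regression model, and `llm_classifier` would wrap a call to whichever LLM is configured. Running the cheap regex stage first keeps model calls for the messages that actually need them.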

[Figure: architecture diagram]


Folder Structure

  1. training/:

    • Contains the code for training models using Sentence Transformer and Logistic Regression.
    • Includes the code for regex-based classification.
  2. models/:

    • Stores the saved models, including Sentence Transformer embeddings and the Logistic Regression model.
  3. resources/:

    • Contains resource files such as test CSV files, output files, and images.
  4. Root Directory:

    • Contains the FastAPI server code (server.py).

Setup Instructions

  1. Install Dependencies: Make sure you have Python installed on your system. Install the required Python libraries by running the following command:

    pip install -r requirements.txt
  2. Run the FastAPI Server: To start the server, use the following command:

    uvicorn server:app --reload

    Once the server is running, you can access the API at:

    • http://127.0.0.1:8000/ (Main endpoint)
    • http://127.0.0.1:8000/docs (Interactive Swagger documentation)
    • http://127.0.0.1:8000/redoc (Alternative API documentation)

Usage

Upload a CSV file containing logs to the FastAPI endpoint for classification. Ensure the file has the following columns:

  • source
  • log_message

The output is a CSV file with an additional column, target_label, which holds the predicted label for each log entry.
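The input and output format described above can be sketched with the standard library alone. This is an illustrative outline under stated assumptions, not the server's implementation: the function name `classify_csv` and the `classify` callable are hypothetical, and only the column names source, log_message, and target_label come from this README.

```python
import csv
import io
from typing import Callable

REQUIRED_COLUMNS = ("source", "log_message")


def classify_csv(csv_text: str, classify: Callable[[str, str], str]) -> str:
    """Read CSV text, label each row, and return CSV text with target_label added.

    classify is a hypothetical callable taking (source, log_message)
    and returning a label string.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = list(reader.fieldnames or [])

    # Reject input that lacks either of the two required columns.
    missing = [col for col in REQUIRED_COLUMNS if col not in fieldnames]
    if missing:
        raise ValueError(f"CSV is missing required columns: {missing}")

    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=[*fieldnames, "target_label"], lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        row["target_label"] = classify(row["source"], row["log_message"])
        writer.writerow(row)
    return out.getvalue()
```

For example, an input with the header `source,log_message` and one data row produces an output with the header `source,log_message,target_label` and the same row with its label appended.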


Contact

If you have any questions or feedback, please feel free to contact me at pranjal360agarwal@gmail.com. You can also connect with me on LinkedIn or Twitter. Thank you for visiting my project!

Made with ❤ by Pranjal Agarwal.
