This project leverages key object-oriented principles:
- Hierarchy: Classes are organized in a structured hierarchy, where AbstractDataSource serves as a base class. Specific data source types (e.g., CsvSource) inherit from this base, allowing shared behaviors while enabling specific implementations.
- Polymorphism: Polymorphism allows each data source type to implement its unique data handling methods while sharing a common interface defined in AbstractDataSource. This enables the project to process different data sources interchangeably.
- Encapsulation: Each class encapsulates its data and methods, exposing only what's necessary. This keeps data handling logic modular, with each class responsible for managing its own state and operations, enhancing maintainability and flexibility.
- AbstractDataSource: This is an abstract base class that defines the structure for data source handling. Other data source classes inherit from this base class and implement specific functionalities.
- CsvSource: This class handles CSV data, allowing you to read, parse, and process CSV files.
- FilesSources: Manages file-related operations, including handling multiple file sources.
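The hierarchy and shared interface described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the read_data method name comes from the usage example in this README, while its return type, the private _path attribute, and the count_rows helper are assumptions made for the sketch.

```python
import csv
from abc import ABC, abstractmethod


class AbstractDataSource(ABC):
    """Common interface each data source is expected to implement (assumed shape)."""

    @abstractmethod
    def read_data(self) -> list[dict[str, str]]:
        """Return the source's records as a list of row dictionaries."""


class CsvSource(AbstractDataSource):
    """Illustrative CSV implementation; the project's real class may differ."""

    def __init__(self, path: str) -> None:
        # Encapsulation: the path is internal state, callers use read_data().
        self._path = path

    def read_data(self) -> list[dict[str, str]]:
        with open(self._path, newline="", encoding="utf-8") as handle:
            return list(csv.DictReader(handle))


def count_rows(source: AbstractDataSource) -> int:
    """Hypothetical helper: accepts any subclass, not only CsvSource."""
    return len(source.read_data())
```

Because count_rows depends only on the base-class interface, it would keep working unchanged if another data source type were passed in.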
- Set Up a CSV Data Source: Use the CsvSource class to load and manage data from a CSV file. Instantiate the CsvSource with the path to your CSV file to begin working with its contents.
from CsvSource import CsvSource
csv_data = CsvSource("path/to/your/file.csv")
data = csv_data.read_data()
This example will load the specified CSV file, allowing further processing.
Use the FilesSources class to manage and read multiple file sources, centralizing data processing from various files.
from FilesSources import FilesSources
files = FilesSources(["file1.csv", "file2.csv"])
combined_data = files.aggregate_data()
Here, aggregate_data processes multiple files and combines their content based on your requirements.
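How the files are combined is not specified here. As one possible reading, the sketch below assumes that aggregation means concatenating the rows of each CSV file in the order the paths were given. The class body is a hypothetical stand-in, not the project's implementation; only the class name, the list-of-paths constructor, and the aggregate_data method name come from this README.

```python
import csv


class FilesSources:
    """Illustrative stand-in; the project's real class may aggregate differently."""

    def __init__(self, paths: list[str]) -> None:
        self._paths = list(paths)  # copy so later changes to the caller's list have no effect

    def aggregate_data(self) -> list[dict[str, str]]:
        """Assumption: aggregation is row concatenation in file order."""
        combined: list[dict[str, str]] = []
        for path in self._paths:
            with open(path, newline="", encoding="utf-8") as handle:
                combined.extend(csv.DictReader(handle))
        return combined
```

Other strategies, such as merging on a shared key or removing duplicate rows, would replace the body of aggregate_data without changing how callers use the class.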
Ensure you have the following:
- Python 3.12
- SQLite (Python's standard library includes the sqlite3 module, so a separate install is usually unnecessary)
- Poetry (for installing dependencies)
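To confirm the first two prerequisites before installing anything, a short check like the one below may help. It is a sketch rather than part of the project: check_prerequisites is a hypothetical helper, and it uses only the standard library's sys and sqlite3 modules.

```python
import sqlite3
import sys


def check_prerequisites(min_python: tuple[int, int] = (3, 12)) -> dict[str, object]:
    """Report whether the interpreter is new enough and which SQLite library it uses."""
    return {
        "python_ok": sys.version_info[:2] >= min_python,
        "python_version": ".".join(str(part) for part in sys.version_info[:3]),
        "sqlite_version": sqlite3.sqlite_version,  # version of the linked SQLite library
    }


if __name__ == "__main__":
    for key, value in check_prerequisites().items():
        print(f"{key}: {value}")
```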
To start, clone this repository to your local machine:
git clone https://github.com/yourusername/your-repo-name.git
cd your-repo-name
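- Install Dependencies: Install the required dependencies listed in pyproject.toml:
pip install poetry
poetry install
poetry shell
On Poetry 2.0 and later, poetry shell is provided by the poetry-plugin-shell plugin; poetry run python main.py works without it.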
- main.py: The main entry point for the project is in the main.py file. Run this script to execute the project’s primary functions, leveraging the classes and their methods to perform data operations.
Running the Project: To start the application, run the main.py script:
python main.py
You can extend or customize this project by:
- Adding New Data Source Types: Create new classes that inherit from AbstractDataSource and implement the required methods.
- Modifying Aggregation Logic: Adjust or expand the logic in FilesSources to handle data differently according to your needs.
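As an illustration of the first extension point, the sketch below adds a JSON-backed source. JsonSource is a hypothetical class, and the AbstractDataSource shown is a stand-in that assumes read_data is the only required method; check the project's actual base class for the methods it really declares.

```python
import json
from abc import ABC, abstractmethod


class AbstractDataSource(ABC):
    """Stand-in for the project's base class (assumed to require read_data)."""

    @abstractmethod
    def read_data(self) -> list[dict]:
        """Return the source's records."""


class JsonSource(AbstractDataSource):
    """Hypothetical new source type: a JSON file holding an array of objects."""

    def __init__(self, path: str) -> None:
        self._path = path

    def read_data(self) -> list[dict]:
        with open(self._path, encoding="utf-8") as handle:
            records = json.load(handle)
        # Reject files whose top-level value is not an array of records.
        if not isinstance(records, list):
            raise ValueError("expected a top-level JSON array of records")
        return records
```

Any code written against the base-class interface could then accept a JsonSource wherever it accepts a CsvSource.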