This project showcases how object-oriented principles like hierarchy, polymorphism, and encapsulation can improve data handling and processing. Built with Python, SQLite, and Poetry for dependency management, the project is modular and flexible, making it easy to extend and maintain.

caio-moliveira/read-files-to-dataframe

Introduction to Hierarchy, Polymorphism, and Encapsulation

This project leverages key object-oriented principles:

  • Hierarchy: Classes are organized in a structured hierarchy, where AbstractDataSource serves as a base class. Specific data source types (e.g., CsvSource) inherit from this base, allowing shared behaviors while enabling specific implementations.

  • Polymorphism: Each data source type implements its own data handling methods behind the common interface defined in AbstractDataSource, so the project can process different data sources interchangeably.

  • Encapsulation: Each class encapsulates its data and methods, exposing only what's necessary. This keeps data handling logic modular, with each class responsible for managing its own state and operations, enhancing maintainability and flexibility.
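
The three principles above can be sketched in a few lines. This is a minimal illustration, not the project's actual code: only the names AbstractDataSource, CsvSource, and read_data come from this README, while TxtSource, count_records, the path property, and the return type (a list of dictionaries read with the standard csv module) are assumptions made to keep the example self-contained.

```python
import csv
from abc import ABC, abstractmethod


class AbstractDataSource(ABC):
    """Common interface shared by every data source (hierarchy)."""

    def __init__(self, path: str) -> None:
        # The leading underscore marks the path as internal state (encapsulation).
        self._path = path

    @property
    def path(self) -> str:
        """Read-only view of the source location (assumed helper)."""
        return self._path

    @abstractmethod
    def read_data(self) -> list[dict[str, str]]:
        """Return the records held by this source."""


class CsvSource(AbstractDataSource):
    def read_data(self) -> list[dict[str, str]]:
        # newline="" is what the csv module expects for file objects.
        with open(self._path, newline="", encoding="utf-8") as handle:
            return list(csv.DictReader(handle))


class TxtSource(AbstractDataSource):
    """Hypothetical second source: one record per non-empty line."""

    def read_data(self) -> list[dict[str, str]]:
        with open(self._path, encoding="utf-8") as handle:
            return [{"line": text.strip()} for text in handle if text.strip()]


def count_records(source: AbstractDataSource) -> int:
    # Accepts any subclass because it relies only on the shared
    # read_data interface (polymorphism).
    return len(source.read_data())
```

Because AbstractDataSource declares read_data as abstract, instantiating it directly raises a TypeError, which forces every concrete source to provide its own implementation.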

Components

  1. AbstractDataSource: This is an abstract base class that defines the structure for data source handling. Other data source classes inherit from this base class and implement specific functionalities.
  2. CsvSource: This class handles CSV data, allowing you to read, parse, and process CSV files.
  3. FilesSources: Manages several file sources at once and combines their data through aggregate_data.

How to Use

Setting Up a CSV Data Source

Use the CsvSource class to load and manage data from a CSV file. Instantiate CsvSource with the path to your CSV file to begin working with its contents.

    from CsvSource import CsvSource

    # Point the source at a CSV file, then read its contents
    csv_data = CsvSource("path/to/your/file.csv")
    data = csv_data.read_data()

This example loads the specified CSV file so that its contents can be processed further.

Working with Multiple File Sources

Use the FilesSources class to manage and read multiple file sources, centralizing data processing from various files.

from FilesSources import FilesSources

# List every file to include, then combine their contents
files = FilesSources(["file1.csv", "file2.csv"])
combined_data = files.aggregate_data()

Here, aggregate_data processes multiple files and combines their content based on your requirements.
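
One way such an aggregation could work is sketched below. It assumes every file is a CSV with a header row and that "combining" means concatenating rows in the order the files were listed; the source_file column and the _read_one helper are hypothetical additions, and the project's real FilesSources may behave differently (for instance, by returning a DataFrame).

```python
import csv


class FilesSources:
    """Hypothetical sketch: reads several CSV files and concatenates their rows."""

    def __init__(self, paths: list[str]) -> None:
        # Copy the list so later changes by the caller have no effect here.
        self._paths = list(paths)

    def _read_one(self, path: str) -> list[dict[str, str]]:
        with open(path, newline="", encoding="utf-8") as handle:
            return list(csv.DictReader(handle))

    def aggregate_data(self) -> list[dict[str, str]]:
        combined: list[dict[str, str]] = []
        for path in self._paths:
            for row in self._read_one(path):
                # Tag each row with its origin so it stays traceable after merging.
                combined.append({**row, "source_file": path})
        return combined
```

Keeping the per-file reading in its own method means the combining rule in aggregate_data can change without touching how a single file is parsed.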

Prerequisites

Ensure you have the following:

  • Python 3.12
  • SQLite (the sqlite3 module ships with Python's standard library)
  • Poetry (for installing dependencies)

Setup and Installation

1. Clone the Repository

To start, clone this repository to your local machine:

git clone https://github.com/caio-moliveira/read-files-to-dataframe.git
cd read-files-to-dataframe
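
2. Install Dependencies

Install Poetry, then install the dependencies listed in pyproject.toml and activate the virtual environment:

pip install poetry
poetry install
poetry shell

If poetry shell is unavailable in your Poetry version (it moved to a plugin in Poetry 2.0), prefix commands with poetry run instead.

3. Main.py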

The main entry point for the project is in the main.py file. Run this script to execute the project’s primary functions, leveraging the classes and their methods to perform data operations.

Running the Project

To start the application, run the main.py script:

python main.py

Customization

You can extend or customize this project by:

  • Adding New Data Source Types: Create new classes that inherit from AbstractDataSource and implement the required methods.
  • Modifying Aggregation Logic: Adjust or expand the logic in FilesSources to handle data differently according to your needs.
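
As an illustration of the first point, a new source type might look like the sketch below. The AbstractDataSource shown here is a stand-in written for this example, and JsonSource is a hypothetical class; the project's real base class may require different methods or return types.

```python
import json
from abc import ABC, abstractmethod


class AbstractDataSource(ABC):
    """Stand-in for the project's base class; the real one may differ."""

    def __init__(self, path: str) -> None:
        self._path = path

    @abstractmethod
    def read_data(self) -> list[dict]:
        """Return the records held by this source."""


class JsonSource(AbstractDataSource):
    """Hypothetical new source type: a JSON file holding an array of objects."""

    def read_data(self) -> list[dict]:
        with open(self._path, encoding="utf-8") as handle:
            payload = json.load(handle)
        # Reject files whose top-level value is not an array of records.
        if not isinstance(payload, list):
            raise ValueError(f"expected a JSON array in {self._path}")
        return payload
```

Code that already works with AbstractDataSource objects can accept a JsonSource without changes, since it only depends on read_data.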
