PredPatt Integration and Python 3.12+ Modernization #31

Merged: 30 commits, Jul 31, 2025
Commits
e5bd6e4
Migrates package to Python 3.12 and configures new build tools. Introd…
aaronstevenwhite Jul 26, 2025
671450d
Adds comprehensive test suite for PredPatt functionality and updates …
aaronstevenwhite Jul 28, 2025
10b1348
Refactors project structure by removing obsolete test files and updat…
aaronstevenwhite Jul 28, 2025
22e90f0
Refactors import statements across multiple modules for improved orga…
aaronstevenwhite Jul 28, 2025
dcf03df
Refactors type hints and error handling across multiple modules to im…
aaronstevenwhite Jul 29, 2025
f2901ad
Enhances the PredPatt module by expanding documentation and type hint…
aaronstevenwhite Jul 29, 2025
ff71ae8
Refactors type hints and method signatures across the corpus, graph, …
aaronstevenwhite Jul 29, 2025
86035e6
Updates documentation in README and tutorial files for clarity on dat…
aaronstevenwhite Jul 29, 2025
70e019f
Enhances the `__init__.py` file with a comprehensive module descripti…
aaronstevenwhite Jul 29, 2025
0ded3cc
Enhances documentation across UDS modules, including detailed descrip…
aaronstevenwhite Jul 29, 2025
9d020ed
Enhances the UDSCorpus class by refining type hints for the sentences…
aaronstevenwhite Jul 29, 2025
b1f1fa9
Refines documentation and type hints in the UDS annotation module. En…
aaronstevenwhite Jul 29, 2025
ed3a9ac
Refines type hints and documentation in the UDS corpus, document, and…
aaronstevenwhite Jul 29, 2025
7b1951e
Refactors mypy configuration to ignore errors in test and documentati…
aaronstevenwhite Jul 29, 2025
77f5648
Enhances the UDS visualization module by adding comprehensive docstri…
aaronstevenwhite Jul 29, 2025
7ad9b38
Refactors the PredPatt module by restructuring the `__init__.py`, `co…
aaronstevenwhite Jul 30, 2025
e002682
Enhances type definitions and documentation in the PredPatt typing mo…
aaronstevenwhite Jul 30, 2025
196e64c
Refines documentation for the Token class in the PredPatt module. Upd…
aaronstevenwhite Jul 30, 2025
3ba5f8b
Refactors the Predicate class in the PredPatt module to introduce a n…
aaronstevenwhite Jul 30, 2025
7b567f2
Refactors documentation across the PredPatt module to enhance clarity…
aaronstevenwhite Jul 30, 2025
040be2d
Refactors comments and documentation across the PredPatt filters and …
aaronstevenwhite Jul 30, 2025
c3e52e2
Enhances documentation across multiple modules by refining docstrings…
aaronstevenwhite Jul 30, 2025
8821db8
Refactors method signatures across UDS modules for improved readabili…
aaronstevenwhite Jul 30, 2025
0833509
Enhances documentation across various modules, including `rdf.py`, `e…
aaronstevenwhite Jul 30, 2025
527c111
Add CHANGELOG and CI workflow; update README and documentation
aaronstevenwhite Jul 30, 2025
131ef71
Update Dockerfile, requirements, and documentation for Jupyter Lab in…
aaronstevenwhite Jul 30, 2025
a1236f2
Enhance installation instructions and documentation for Decomp
aaronstevenwhite Jul 30, 2025
568cb89
Fixes CI errors.
aaronstevenwhite Jul 31, 2025
ab784dd
Updates license year.
aaronstevenwhite Jul 31, 2025
d124772
Refactor Dockerfile and update installation instructions
aaronstevenwhite Jul 31, 2025
131 changes: 131 additions & 0 deletions .github/workflows/ci.yml
@@ -0,0 +1,131 @@
name: CI

on:
  push:
    branches: [ master, main ]
  pull_request:
    branches: [ master, main ]

jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ["3.12"]

    steps:
      - uses: actions/checkout@v4

      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}

      - name: Cache pip packages
        uses: actions/cache@v4
        with:
          path: ~/.cache/pip
          key: ${{ runner.os }}-pip-${{ hashFiles('**/requirements.txt', '**/pyproject.toml') }}
          restore-keys: |
            ${{ runner.os }}-pip-

      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install -e ".[dev]"

      - name: Run tests with pytest (including slow tests)
        run: |
          pytest --runslow -v

  lint:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v4

      - name: Set up Python 3.12
        uses: actions/setup-python@v5
        with:
          python-version: "3.12"

      - name: Cache pip packages
        uses: actions/cache@v4
        with:
          path: ~/.cache/pip
          key: ${{ runner.os }}-pip-${{ hashFiles('**/requirements.txt', '**/pyproject.toml') }}
          restore-keys: |
            ${{ runner.os }}-pip-

      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install -e ".[dev]"

      - name: Run ruff
        run: |
          # Check only for errors (E) and critical failures (F), not style warnings
          # Exclude tests directory from linting
          ruff check . --select E,F --exclude tests/
          # Format check is optional - only fail on critical issues
          ruff format --check . || true

  type-check:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v4

      - name: Set up Python 3.12
        uses: actions/setup-python@v5
        with:
          python-version: "3.12"

      - name: Cache pip packages
        uses: actions/cache@v4
        with:
          path: ~/.cache/pip
          key: ${{ runner.os }}-pip-${{ hashFiles('**/requirements.txt', '**/pyproject.toml') }}
          restore-keys: |
            ${{ runner.os }}-pip-

      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install -e ".[dev]"

      - name: Run mypy
        run: |
          mypy decomp

  docs:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v4

      - name: Set up Python 3.12
        uses: actions/setup-python@v5
        with:
          python-version: "3.12"

      - name: Cache pip packages
        uses: actions/cache@v4
        with:
          path: ~/.cache/pip
          key: ${{ runner.os }}-pip-${{ hashFiles('**/requirements.txt', '**/pyproject.toml', '**/docs/requirements.txt') }}
          restore-keys: |
            ${{ runner.os }}-pip-

      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          # Install the package first
          pip install -e .
          # Then install documentation dependencies
          pip install -r docs/requirements.txt

      - name: Build documentation
        run: |
          cd docs
          make html
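
Note on the test job in this workflow: `--runslow` is not a built-in pytest flag, so `pytest --runslow -v` only works if the project registers that option itself. The repository's actual `conftest.py` is not part of the visible diff; the sketch below follows the common pattern from the pytest documentation and assumes a marker named `slow`, which may differ from what the project uses.

```python
# conftest.py (hypothetical sketch; the project's real file is not shown in this diff)
import pytest


def pytest_addoption(parser):
    # Register the custom --runslow command-line flag.
    parser.addoption(
        "--runslow", action="store_true", default=False, help="run slow tests"
    )


def pytest_configure(config):
    # Declare the (assumed) `slow` marker so strict marker checking accepts it.
    config.addinivalue_line("markers", "slow: mark test as slow to run")


def pytest_collection_modifyitems(config, items):
    # With --runslow, leave the collected tests untouched.
    if config.getoption("--runslow"):
        return
    # Otherwise attach a skip marker to every test marked as slow.
    skip_slow = pytest.mark.skip(reason="need --runslow option to run")
    for item in items:
        if "slow" in item.keywords:
            item.add_marker(skip_slow)
```

Under this pattern a plain `pytest` run skips tests decorated with `@pytest.mark.slow`, while the CI invocation runs them all.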
140 changes: 140 additions & 0 deletions CHANGELOG.md
@@ -0,0 +1,140 @@
# Changelog

All notable changes to the Decomp project will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [0.3.0] - 2025-07-30

### Added
- **New PredPatt Integration**: Complete integration of PredPatt semantic role labeling functionality into `decomp.semantics.predpatt` module
- **Modern Python Support**: Full Python 3.12+ compatibility with updated type hints using union syntax (`|`) and built-in generics
- **Modern Packaging**: Migration from `setup.py` to `pyproject.toml` with modern build system

### Changed
- **Type System Modernization**: All type hints updated to Python 3.12+ conventions using `|` union syntax and built-in generics
- **Documentation**: Comprehensive documentation overhaul with detailed API references and usage examples
- **Code Quality**: Implementation of ruff and mypy for consistent code formatting and static type checking
- **Test Suite**: Complete pytest-based test suite with differential testing against original PredPatt implementation

### Technical Details
- **Module Structure**: New modular architecture with `core`, `extraction`, `parsing`, `rules`, `filters`, and `utils` submodules
- **Algorithm Fidelity**: Byte-for-byte identical output compatibility with original PredPatt implementation
- **Dependencies**: Updated to modern versions while maintaining backward compatibility
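
The differential testing and byte-for-byte fidelity mentioned above can be sketched as follows. This is a minimal illustration of the technique, not the project's test code: `reference_extract` and `ported_extract` are hypothetical stand-ins for the original and ported implementations.

```python
def reference_extract(tokens: list[str]) -> str:
    """Stand-in for the original implementation (hypothetical)."""
    return " ".join(t.lower() for t in tokens)


def ported_extract(tokens: list[str]) -> str:
    """Stand-in for the ported implementation (hypothetical)."""
    return " ".join(map(str.lower, tokens))


def differential_check(inputs: list[list[str]]) -> list[list[str]]:
    """Return every input on which the two outputs differ at the byte level."""
    mismatches = []
    for tokens in inputs:
        expected = reference_extract(tokens).encode("utf-8")
        actual = ported_extract(tokens).encode("utf-8")
        if expected != actual:
            mismatches.append(tokens)
    return mismatches
```

A test would then assert that `differential_check(corpus_inputs)` returns an empty list, so any divergence from the reference output fails loudly and names the offending input.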

## [0.2.2] - 2022-06-08

### Fixed
- **Corpus Loading**: Fixed broken corpus load from JSON functionality
- **UDS Annotations**: Corrected error in raw UDS-EventStructure annotations processing

### Notes
- Final release of v0.2.x series before major modernization
- Maintained compatibility with Universal Decompositional Semantics v2.0 dataset

## [0.2.1] - 2021-04-05

### Fixed
- **Python 3.9 Compatibility**: Resolved compatibility issues with Python 3.9
- **Dependency Updates**: Updated dependencies to support newer Python versions

### Notes
- Part of Universal Decompositional Semantics v2.0 release series
- Improved cross-platform compatibility

## [0.2.0] - 2021-03-19

### Added
- **Universal Decompositional Semantics v2.0**: First release supporting UDS 2.0 dataset
- **Document-Level Graphs**: Support for document-level semantic graph structures
- **Raw Annotations**: Access to raw annotation data alongside normalized annotations
- **Advanced Metadata**: Enhanced metadata handling and processing capabilities
- **Visualization Module**: New `decomp.vis` module for graph visualization and analysis
- **Enhanced Graph Support**: Improved NetworkX and RDF graph representations

### Changed
- **Major Version Bump**: Significant architectural changes to support UDS v2.0
- **API Enhancements**: Extended API surface for document-level processing
- **Data Format**: Support for both sentence-level and document-level annotation formats

### Technical Details
- **Graph Structures**: Support for complex document-level semantic relationships
- **Annotation Pipeline**: Enhanced pipeline for processing raw and normalized annotations
- **Metadata Schema**: Advanced metadata schema for annotation provenance and confidence

## [0.1.3] - 2020-03-13

### Fixed
- **RDF Cache**: Fixed RDF cache clearing error that could cause memory issues
- **Document Attributes**: Added missing document and sentence ID attributes for better tracking

### Added
- **Improved Tracking**: Better document and sentence identification in corpus processing

### Notes
- Maintenance release improving stability and debugging capabilities
- Enhanced corpus navigation and identification features

## [0.1.2] - 2020-01-17

### Fixed
- **Corpus Construction**: Fixed corpus construction error when using split parameter
- **Data Splitting**: Resolved issues with train/dev/test split functionality

### Technical Details
- **Split Parameters**: Corrected handling of data split parameters in corpus initialization
- **Error Handling**: Improved error messages for corpus construction failures

## [0.1.1] - 2019-10-19

### Fixed
- **Genericity Annotations**: Fixed copular clause argument linking error in genericity annotations
- **Argument Linking**: Corrected semantic role assignment for copular constructions

### Technical Details
- **Linguistic Accuracy**: Improved handling of copular clause structures in semantic annotation
- **Annotation Quality**: Enhanced accuracy of genericity property assignments

## [0.1.0] - 2019-10-01

### Added
- **Initial Release**: First major release of the Decomp toolkit
- **Universal Decompositional Semantics v1.0**: Complete support for UDS v1.0 dataset
- **Core Framework**: Foundation classes for semantic graph processing
- **Syntax Integration**: Universal Dependencies syntax integration
- **Semantic Properties**: Support for multiple semantic annotation types:
  - Genericity annotations
  - Factuality annotations
  - Protorole annotations
  - Temporal annotations
  - Word sense annotations
- **Graph Representations**: NetworkX and RDF graph format support
- **Corpus Management**: Tools for loading, processing, and managing UDS corpora
- **Documentation**: Comprehensive documentation and API reference

### Technical Foundation
- **Graph Infrastructure**: Core graph processing and manipulation capabilities
- **Annotation Framework**: Flexible annotation loading and processing system
- **Type System**: Initial type definitions for semantic structures
- **Testing Framework**: Basic test suite for core functionality

---

## Release Notes

### Dataset Compatibility
- **v0.1.x**: Universal Decompositional Semantics v1.0
- **v0.2.x**: Universal Decompositional Semantics v2.0
- **v0.3.x**: Universal Decompositional Semantics v2.0 + PredPatt integration

### Python Version Support
- **v0.1.x - v0.2.x**: Python 3.6+
- **v0.3.x**: Python 3.12+ (modern type hints and language features)

### Breaking Changes
- **v0.2.0**: API changes for document-level graph support
- **v0.3.0**: Modernized type system, requires Python 3.12+, integrated PredPatt functionality

For detailed technical documentation, see the [Decomp Documentation](https://decomp.readthedocs.io/en/latest/).
For issues and support, visit the [GitHub Repository](https://github.com/decompositional-semantics-initiative/decomp).
19 changes: 13 additions & 6 deletions Dockerfile
@@ -1,9 +1,16 @@
-FROM python:3.6
+FROM quay.io/jupyter/datascience-notebook:2024-11-19

-WORKDIR /usr/src/decomp

-COPY . .
+# set working directory
+WORKDIR "${HOME}/decomp"

-RUN pip install --no-cache-dir -r requirements.txt && \
-    pip install --no-cache-dir . && \
-    python -c "from decomp import UDSCorpus; UDSCorpus()"
+# copy the package files
+COPY --chown=${NB_UID}:${NB_GID} . .

+# install the package and its dependencies
+RUN pip install --no-cache-dir -e ".[viz]" && \
+    # pre-build the UDS corpus to cache it in the image
+    python -c "from decomp import UDSCorpus; UDSCorpus()"

+# set the default command to start Jupyter Lab
+CMD ["start-notebook.py", "--IdentityProvider.token=''", "--IdentityProvider.password=''"]
2 changes: 1 addition & 1 deletion LICENSE
@@ -1,6 +1,6 @@
 MIT License

-Copyright (c) 2020 Aaron Steven White
+Copyright (c) 2025 Aaron Steven White

 Permission is hereby granted, free of charge, to any person obtaining a copy
 of this software and associated documentation files (the "Software"), to deal
7 changes: 0 additions & 7 deletions MANIFEST.in

This file was deleted.
