Releases · whythawk/whyqd

10 Aug 09:16

turukawa

1.0.8

2bea847

1.0.8

CategoryModel terms now use StrictBool to avoid Pydantic's liberal interpretation of booleans.
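To illustrate why this matters, here is a minimal sketch comparing a plain bool field with StrictBool. The LaxTerm and StrictTerm models and their value field are hypothetical stand-ins for illustration, not whyqd's actual CategoryModel.

```python
from pydantic import BaseModel, StrictBool, ValidationError


class LaxTerm(BaseModel):
    # Hypothetical model: a plain bool field coerces bool-like input such as "yes".
    value: bool


class StrictTerm(BaseModel):
    # Hypothetical model: StrictBool only accepts real True/False values.
    value: StrictBool


# The string "yes" is silently converted to True by the lax model.
lax = LaxTerm(value="yes")

# The strict model refuses the same input instead of guessing.
try:
    StrictTerm(value="yes")
    strict_rejected = False
except ValidationError:
    strict_rejected = True

print(lax.value, strict_rejected)
```

With a plain bool, a category term that happens to be the text "yes" could be confused with the boolean True; a strict type keeps the two apart.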


09 Aug 16:13

turukawa

1.0.7

dbb95a0

1.0.7

A minor fix, but crucial for the continuing ambiguity issue with category terms. Source fields, destination fields and their terms can all share a name, which causes painful and tedious issues. Hopefully this fix resolves it.
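The kind of name collision described here can be sketched as follows. This is an illustration under assumed data, not whyqd's implementation: the TERMS list, the role labels and both helper functions are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical terms as (role, field name, term name).
# The source and destination fields share a name, and so do their terms.
Term = Tuple[str, str, str]
TERMS: List[Term] = [
    ("source", "status", "open"),
    ("destination", "status", "open"),
    ("destination", "status", "closed"),
]


def find_by_name(terms: List[Term], name: str) -> List[Term]:
    """Look up terms by bare name only; shared names return several matches."""
    return [t for t in terms if t[2] == name]


def build_index(terms: List[Term]) -> Dict[Term, str]:
    """Key each term by role, field and name so that shared names stay distinct."""
    index: Dict[Term, str] = {}
    for role, field, name in terms:
        key = (role, field, name)
        if key in index:
            raise ValueError(f"Duplicate term: {key}")
        index[key] = f"{role}:{field}:{name}"
    return index


matches = find_by_name(TERMS, "open")
index = build_index(TERMS)
print(len(matches), index[("source", "status", "open")])
```

A lookup by bare name cannot tell the two "open" terms apart, while the fully qualified key resolves each one unambiguously.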


09 Aug 11:49

turukawa

1.0.5

a1c8cd6

1.0.5

Additional ambiguity checks for category term edge case where source or destination fields can share category names.


08 Aug 08:48

turukawa

1.0.4

ad369ae

1.0.4

Dependency updates.
Permitting nrow limit on Parquet files.
Disambiguation where schema subject and object field categories have the same name.
Ambiguity checks for blank space in strings. If the source data includes blank space, it should not be removed, so as to preserve the original structure.
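The blank-space check in the last item can be sketched in plain Python. This is an assumption-laden illustration, not whyqd's code: the raw_values sample and both helper functions are hypothetical.

```python
from typing import Dict, List

# Hypothetical category values as they might appear in source data.
raw_values = ["Yes", "Yes ", " No", "No"]


def unique_terms(values: List[str]) -> List[str]:
    """Collect unique terms in order, leaving blank space untouched."""
    seen: List[str] = []
    for value in values:
        if value not in seen:
            seen.append(value)
    return seen


def ambiguous_terms(values: List[str]) -> Dict[str, List[str]]:
    """Group terms that differ only by surrounding blank space."""
    groups: Dict[str, List[str]] = {}
    for term in unique_terms(values):
        groups.setdefault(term.strip(), []).append(term)
    return {key: terms for key, terms in groups.items() if len(terms) > 1}


preserved = unique_terms(raw_values)
ambiguous = ambiguous_terms(raw_values)
print(preserved)
print(ambiguous)
```

The terms are reported as ambiguous rather than silently stripped, so the source data keeps its original structure.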


05 Jul 15:37

turukawa

1.0.1

e231a0d

1.0.1

Minor convenience update.

Permitting nrow limit on Parquet files.


10 May 14:15

turukawa

1.0.0

96b77c8

1.0.0

This version shares some features with the previous version, but is a complete refactoring and conceptual redesign. It is
not backward compatible. Future versions will maintain compatibility with this one.

Separated data models from schema models so that crosswalks are schema-to-schema.
Complete revision of the API into four discrete Definition classes: SchemaDefinition, DataSourceDefinition,
CrosswalkDefinition and TransformDefinition.
Removed filters and actions that are no longer relevant (including REBASE and MERGE).
Simplified CATEGORISE since it no longer requires deriving terms as part of the crosswalk.
Crosswalks are designed to support continuous integration.
Pydantic models are more transparent via each Definition's .get property.
Refactored Pandas to support Modin and Ray for data >1 million rows.
Mime type support for data sources in Parquet and Feather.
Rewrote documentation in MkDocs, replacing Sphinx.
Revised all tutorials and documentation.


23 Aug 17:31

turukawa

v0.5.0

8b0b02e

whyqd: simplicity, transparency, speed

whyqd provides an intuitive method for restructuring messy data to conform to a standardised metadata schema. It supports data managers and researchers looking to rapidly, and continuously, normalise any messy spreadsheets using a simple series of steps. Once complete, you can import wrangled data into more complex analytical systems or full-feature wrangling tools.

Read the docs. There are also two worked tutorials demonstrating
how you can use whyqd to support transparency in source data curation.

Install using pip:

pip install whyqd

Version 0.5.0 introduced a new, simplified API, along with script-based transformation actions. You can import and
transform any saved method.json files with:

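import whyqd

# SCHEMA_SOURCE: path to your schema; METHOD_SOURCE_V1: path to a saved version 1 method.json
SCHEMA = whyqd.Schema(source=SCHEMA_SOURCE)
schema_scripts = whyqd.parsers.LegacyScript().parse_legacy_method(
    version="1", schema=SCHEMA, source_path=METHOD_SOURCE_V1
)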

Where SCHEMA_SOURCE is a path to your schema. Existing schema.json files should still work.


02 Nov 15:08

turukawa

v0.3.1

1f96e8c

whyqd: simplicity, transparency, speed

whyqd provides an intuitive method for restructuring messy data to conform to a standardised metadata schema. It supports data managers and researchers looking to rapidly, and continuously, normalise any messy spreadsheets using a simple series of steps. Once complete, you can import wrangled data into more complex analytical systems or full-feature wrangling tools.

Read the docs and a full tutorial.

Install using pip:

pip install whyqd
