New project: care-tuning #2
Closed
minimalparts started this conversation in Ideas
I am closing this as the project evolved a lot and drifted from the original plan. In particular, we now have code that will run on a single-board computer and allow someone to train the model completely from scratch. That code is, however, being developed in a separate repository, which will be made public soon.
-
New project: care-tuning.
We are launching a new project which is scheduled to run for the rest of 2024. The aim is to highlight (and repair) core deficiencies in LLMs, focusing on two related aspects of training: a) the choice and presentation of data; b) the explicit induction of useful linguistic biases in the learner. Both points have correlates in the language acquisition literature, which shows that so-called 'motherese' (i.e. the language spoken by a carer addressing a child) exhibits particular features that facilitate language learning. We ask whether 'training with care', on smaller but curated data, can improve the linguistic abilities of artificial language models.
We will be taking GPT-2 as a base model to fine-tune. Our main contribution will be the transformation of raw data into semantically structured input, meant to explicitly teach the model essential features of human language. All training will be performed on a home computer, with no GPU involved. Stay tuned for more. The dedicated repository is here: https://github.com/possible-worlds-research/care-tuning.
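For readers curious what CPU-only fine-tuning of GPT-2 looks like in practice, here is a minimal sketch using the Hugging Face transformers and datasets libraries. This is not the care-tuning code itself (that lives in the repository linked above); the file name `curated_corpus.txt` and the hyperparameters are illustrative placeholders only.

```python
# Minimal sketch: fine-tune GPT-2 on a small curated text file, CPU only.
# Assumes `pip install transformers datasets`; corpus path and settings are placeholders.
from datasets import load_dataset
from transformers import (
    DataCollatorForLanguageModeling,
    GPT2LMHeadModel,
    GPT2TokenizerFast,
    Trainer,
    TrainingArguments,
)

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = GPT2LMHeadModel.from_pretrained("gpt2")

# Load a small, curated plain-text corpus (one example per line).
dataset = load_dataset("text", data_files={"train": "curated_corpus.txt"})["train"]

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=128)

tokenized = dataset.map(tokenize, batched=True, remove_columns=["text"])

# Causal language modelling: labels are the input tokens shifted by one.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

args = TrainingArguments(
    output_dir="care-tuning-out",
    per_device_train_batch_size=4,  # small batch to fit in ordinary RAM
    num_train_epochs=1,
    logging_steps=50,
)

# On a machine with no GPU, the Trainer falls back to CPU automatically.
trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    data_collator=collator,
)
trainer.train()
```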