
Conversation


@dywsy21 dywsy21 commented Aug 4, 2025

Description

I saw #18 and was curious how the model would behave on ARC-AGI if it trained only on the puzzle input/output pairs from the train split, instead of also incorporating the inputs from the test split.

While I know that TTT is allowed in ARC-AGI, training on the test examples beforehand gives the model an unfair head start on the implied rules used in them. It would be interesting to see whether the H&L arch could figure out implied rules it has never seen before, just as humans do.

By removing TTT, your model's evaluation results on ARC-AGI would be more convincing and more indicative of its actual generalization ability. Let me know if this approach would help; happy to chat~
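To make the proposal concrete, here is a minimal sketch of what "train-only" data construction could look like. The `task` dict below is a hypothetical example in the standard ARC JSON layout (each task has `"train"` and `"test"` lists of `{"input", "output"}` grids); `train_only_pairs` is an illustrative helper, not code from this repo.

```python
# Hypothetical ARC-AGI task in the standard JSON layout:
# a task holds "train" and "test" lists of {"input", "output"} grids.
task = {
    "train": [
        {"input": [[0, 1], [1, 0]], "output": [[1, 0], [0, 1]]},
        {"input": [[1, 1], [0, 0]], "output": [[0, 0], [1, 1]]},
    ],
    "test": [
        {"input": [[0, 0], [1, 1]], "output": [[1, 1], [0, 0]]},
    ],
}

def train_only_pairs(task):
    """Yield (input, output) pairs from the train split only,
    deliberately excluding test inputs from the training corpus."""
    for pair in task["train"]:
        yield pair["input"], pair["output"]

pairs = list(train_only_pairs(task))
print(len(pairs))  # 2: the test input never enters training
```

The point is simply that the test grids are never touched before evaluation, so any rule the model infers must come from the demonstration pairs alone.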


helma436 commented Aug 4, 2025

.

@shawntan

Does the TTT setting for ARC-AGI allow for parameter updates across evaluation examples?

If it doesn't, then doing Training + TTT together represents a very different setting from Training -> TTT per evaluation instance, right? In the latter case each evaluation instance would be i.i.d., and the model could not use generalized information across the evaluation set.
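The distinction above can be sketched with a toy example. `adapt` below stands in for a hypothetical TTT update step (it is not the repo's training code); the two loops show how resetting parameters per instance keeps evaluation tasks independent, while shared updates let information accumulate across them.

```python
import copy

# Toy weights standing in for model parameters.
base_params = {"w": 0.0}

def adapt(params, task_signal):
    # Hypothetical test-time update: nudge the weight toward the task.
    params = dict(params)
    params["w"] += task_signal
    return params

tasks = [1.0, 2.0, 3.0]

# Setting A: per-instance TTT. Reset to base weights for every task,
# so nothing leaks across evaluation instances (each task is i.i.d.).
per_instance = [adapt(copy.deepcopy(base_params), t)["w"] for t in tasks]

# Setting B: shared TTT. Updates accumulate across tasks, so later
# tasks implicitly benefit from earlier evaluation examples.
shared, p = [], copy.deepcopy(base_params)
for t in tasks:
    p = adapt(p, t)
    shared.append(p["w"])

print(per_instance)  # [1.0, 2.0, 3.0]
print(shared)        # [1.0, 3.0, 6.0]
```

Only Setting A matches the i.i.d. assumption described above; whether the official ARC-AGI TTT rules require it is the question being asked.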
