-
Notifications
You must be signed in to change notification settings - Fork 2
Multiple inputs & nearest neighbors #7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Everything seems good. There's just one test I think should be changed, and a comment on the unnecessary complexity of the query_material_property
method.
density=4.7 | ||
prop="band_gap" | ||
|
||
output = exact_match.query_material_property(formula, prop, one_only=False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This has the potential to fail in future as new materials are added to materials project.
Maybe assert len(output) >= 9 is a better test
docs = [doc for doc in docs if doc.symmetry.crystal_system.value == lattice] | ||
return docs | ||
|
||
def query_material_property(self, formula, desired_prop, space_group_num=None, density=None, lattice=None, one_only=True): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am still a little confused about how it'll be implemented in the agent. Is the agent supposed to decide if they will query one_only
or not?
If the user/agent should have access to it, the warning of returning one_only
should be in this function. If this will be a private function and the control will be done in the get_prop
function, then we don't need the one_only
argument here
all_values = self._get_prop(all_values, property) | ||
return sum(all_values) / len(all_values) | ||
|
||
def k_vals(self, input_param:dict, k:int): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd change this fxn name for something like has_k_vals
. It took me a while to understand it was checking if there's enough examples to fill the context already
all_props = self._get_prop(docs, desired_prop) | ||
if not docs: | ||
print ("Warning: no matches found. Returning the average of closet matches.") | ||
nearest_neighbor_avg = self.get_weighted_average_of_neighbors({"formula": formula, "desired_prop": desired_prop, "space_group_num": space_group_num, "density": density, "lattice": lattice}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to compare this average pooling scheme with the ICL to decide which one to keep. But I think it's ok to keep that for now
In this PR:
Tools can now accept the following parameters:
any of the non-required parameters are left out of the query
Alternative to ICL:
I added a demo notebook for this as well