Skip to content

Conversation

SamCox822
Copy link
Collaborator

In this PR:

  • Tools can now accept the following parameters:

    • formula (required)
    • lattice
    • space group number
    • density
    • requested property (required)
  • any of the non-required parameters are left out of the query

  • Alternative to ICL:

    • Now when a query gives back multiple results, you get a warning & the average of the desired property for all of the results -> this is default behavior, but is an optional flag
    • If a query returns no results, a neighbor search (at least k=10 neighbors) / weighted averaging is done to estimate the property (and user is given a warning). The weights for the weighted average are basically 3/4 matching properties=3, 2/4 matching properties=2, etc.

I added a demo notebook for this as well

@SamCox822 SamCox822 requested a review from maykcaldas April 19, 2024 06:11
Copy link
Owner

@maykcaldas maykcaldas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Everything seems good. There's just one test I think should be changed, and a comment on the unnecessary complexity of the query_material_property method.

density=4.7
prop="band_gap"

output = exact_match.query_material_property(formula, prop, one_only=False)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has the potential to fail in future as new materials are added to materials project.
Maybe assert len(output) >= 9 is a better test

docs = [doc for doc in docs if doc.symmetry.crystal_system.value == lattice]
return docs

def query_material_property(self, formula, desired_prop, space_group_num=None, density=None, lattice=None, one_only=True):
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am still a little confused about how it'll be implemented in the agent. Is the agent supposed to decide if they will query one_only or not?
If the user/agent should have access to it, the warning of returning one_only should be in this function. If this will be a private function and the control will be done in the get_prop function, then we don't need the one_only argument here

all_values = self._get_prop(all_values, property)
return sum(all_values) / len(all_values)

def k_vals(self, input_param:dict, k:int):
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd change this fxn name for something like has_k_vals. It took me a while to understand it was checking if there's enough examples to fill the context already

all_props = self._get_prop(docs, desired_prop)
if not docs:
print ("Warning: no matches found. Returning the average of closet matches.")
nearest_neighbor_avg = self.get_weighted_average_of_neighbors({"formula": formula, "desired_prop": desired_prop, "space_group_num": space_group_num, "density": density, "lattice": lattice})
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to compare this average pooling scheme with the ICL to decide which one to keep. But I think it's ok to keep that for now

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants