Signed Mutual Information #43

@tetomonti

[Note: I'm adding the content of my email here for record keeping]

revealer returns a signed MI because it multiplies an MI-derived coefficient by the sign of the features’ correlation.
In the code, cond_mutual_inf has the step (line 202):

`CIC <- sign(rho) * sqrt(1 - exp(-2 * CMI))`

And the mutual_inf_v2 function has the step (line 248):

`IC <- sign(rho) * sqrt(1 - exp(-2 * MI))`

In both cases, the MI is first mapped to the [0, 1] range by sqrt(1 - exp(-2 * MI)) and then multiplied by the sign of the correlation (rho) between the two variables.
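To make the transform concrete, here is a minimal sketch in base R. The helper name `signed_ic` and the clamping of negative MI estimates are my assumptions, not code from revealer. For a bivariate normal pair, MI = -0.5 * log(1 - rho^2), so the transform recovers |rho| and the sign step restores rho itself.

```r
# Hypothetical helper (not from revealer): map an MI estimate in nats to a
# signed coefficient in [-1, 1] using the transform quoted above.
signed_ic <- function(mi, rho) {
  mi <- pmax(mi, 0)  # assumption: clamp small negative estimates to zero
  sign(rho) * sqrt(1 - exp(-2 * mi))
}

# Sanity check under a Gaussian assumption: with MI = -0.5 * log(1 - rho^2),
# the signed coefficient should equal rho.
rho <- c(-0.8, -0.3, 0.6)
mi_gauss <- -0.5 * log(1 - rho^2)
all.equal(signed_ic(mi_gauss, rho), rho)
```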

I think we can do the same in our knnmi-based score. To avoid losing efficiency, we could call the cor function once on the entire set of features. That is, when computing the MI between X and all the remaining features, say REST, do something like:

MI <- knnmi(X,REST,Z)
RHO <- cor(X,REST)
SMI <- MI * sign(RHO)
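A runnable sketch of the same idea on toy data follows. Everything here is illustrative: the data are simulated, and `mi_stand_in` is a made-up Gaussian plug-in estimator that stands in for the real knnmi call, so that the example runs in base R alone. It assumes REST holds one feature per column, since `cor` correlates columns.

```r
set.seed(1)
n <- 200
X <- rnorm(n)
# Toy feature matrix: samples in rows, one feature per column (assumption)
REST <- cbind(pos = X + rnorm(n), neg = -X + rnorm(n), noise = rnorm(n))

# Stand-in MI estimator (hypothetical, NOT the knnmi API): Gaussian plug-in
# estimate, returning one non-negative value per column of m.
mi_stand_in <- function(x, m) {
  r <- as.vector(cor(x, m))
  -0.5 * log(1 - r^2)
}

MI <- mi_stand_in(X, REST)
RHO <- as.vector(cor(X, REST))  # a single cor() call covers all features
SMI <- MI * sign(RHO)           # attach the correlation sign to each MI
names(SMI) <- colnames(REST)
SMI
```

If the association is expected to be monotone but not linear, `cor(X, REST, method = "spearman")` may give a more reliable sign. Note also that this uses the marginal correlation, whose sign need not match the association conditional on Z.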
