How to cluster based on filtered predictions? #2358
nickloganfarmer
started this conversation in
General
Replies: 1 comment
-
You can register a new predictions table with the linker and then cluster from the registered table:
predict_splink_df = linker.table_management.register_table_predict(predictPandaDf)
clusters = linker.clustering.cluster_pairwise_predictions_at_threshold(predict_splink_df, threshold_match_probability=...)
In general you can use |
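For anyone landing here later, the two calls in the reply can be wrapped into one small helper. This is only a sketch, not an official Splink recipe: the helper name `cluster_filtered_predictions`, its `threshold` parameter, and the 0.95 default are made up for illustration, and it assumes the filtered pandas DataFrame still has the columns that `linker.inference.predict()` produced (the unique id columns and `match_probability`).

```python
def cluster_filtered_predictions(linker, filtered_predictions_df, threshold=0.95):
    """Register a filtered predictions DataFrame with the linker, then cluster it.

    Hypothetical helper: only the two linker calls come from the reply above.
    `linker` is an already configured Splink linker, and
    `filtered_predictions_df` is the pandas DataFrame left after custom filtering.
    """
    # Register the pandas DataFrame so Splink treats it as a predictions table.
    predict_splink_df = linker.table_management.register_table_predict(
        filtered_predictions_df
    )
    # Cluster using only the pairs that survived the filter.
    return linker.clustering.cluster_pairwise_predictions_at_threshold(
        predict_splink_df,
        threshold_match_probability=threshold,
    )
```

Usage would look something like `clusters = cluster_filtered_predictions(linker, replace_mismatches(predictions_df))`, where `replace_mismatches` is the custom filter from the question and `predictions_df` is the pandas version of the `predict()` output. The result is a SplinkDataFrame, so `clusters.as_pandas_dataframe()` should get it back into pandas.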
-
After I run the linker.inference.predict function on my dataset, I pass the predictions through a custom filter to cut down on incorrect matches, which leaves me with a pandas DataFrame. However, clustering seems to require the original SplinkDataFrame produced by the predict function. Is there a way to use the cluster_pairwise_predictions_at_threshold function with my filtered pandas DataFrame instead of the entire original SplinkDataFrame? Below is an example of my code with my custom "replace_mismatches" function used to filter and return a clean DataFrame: