Replies: 2 comments
-
I'd definitely be up for adding an example to the Splink docs that uses a dataset generated by pseudopeople. Do you have a suggestion for a pre-created dataset that we could use? Or would you suggest that the example both generates and then links a dataset? If it turns out well, we could also consider adding it to these proposed inbuilt datasets.
-
Great! I think that a single year of simulated tax data could make a good, minimal example of deduplication, since it includes multiple rows for individuals who held multiple jobs in a year. Linking simulated decennial and tax data could make a good, minimal example of record linkage. Here is some code to provide a more concrete demonstration of the direction I'm thinking: https://colab.research.google.com/drive/1-vikcDAKXw2pgdvTtV4sw1XSVTgNUzBY?usp=sharing
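To make the deduplication idea concrete, here is a toy sketch in plain Python of why a single year of tax data contains duplicate people: each job produces its own row. The rows, field names, and the `group_by_person` helper are made up for illustration and are not pseudopeople's actual schema or Splink's API. Exact matching on name and date of birth stands in for the probabilistic matching Splink would do; with noisy simulated data, exact matching would miss some pairs.

```python
from collections import defaultdict

# Hypothetical W-2-style rows: one row per (person, employer) pair.
# Field names and values are invented for this sketch.
rows = [
    {"row_id": 0, "first_name": "Ana", "last_name": "Lopez",
     "date_of_birth": "1985-03-02", "employer": "Acme"},
    {"row_id": 1, "first_name": "Ana", "last_name": "Lopez",
     "date_of_birth": "1985-03-02", "employer": "Bolt"},
    {"row_id": 2, "first_name": "Sam", "last_name": "Reed",
     "date_of_birth": "1990-11-17", "employer": "Acme"},
]


def group_by_person(rows):
    """Cluster row IDs whose name and date of birth agree exactly."""
    clusters = defaultdict(list)
    for row in rows:
        key = (
            row["first_name"].lower(),
            row["last_name"].lower(),
            row["date_of_birth"],
        )
        clusters[key].append(row["row_id"])
    return list(clusters.values())


clusters = group_by_person(rows)
print(clusters)  # rows 0 and 1 share a cluster because Ana held two jobs
```

A real example would replace the exact-match key with a trained Splink model, so that rows with typos or missing fields can still be clustered.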
-
From @aflaxman #1358 (comment):