fakeR

Anonymizing Data and Creating Fake Data

Intro

A colleague of mine asked whether I had a way to anonymize the distribution data that we can get from Teosto’s web service. The data contains a lot of sensitive information, so if we want to demonstrate it to a customer or a stakeholder, something needs to be done to protect the privacy of everyone involved. Not that long ago, I also happened to see a blog post by Khuyen Tran (Data Science Simplified) about creating fake data in Python.