Unlike in the medical records example, Netflix had been very careful not to include any data that could identify a user, such as zip code, birthdate, and of course names, personal IDs, etc. Nevertheless, only a couple of weeks after the release, another PhD student, Arvind Narayanan, announced that he, together with his advisor Vitaly Shmatikov, had been able to connect many of the unique IDs in the Netflix dataset to real people by cross-referencing another publicly available dataset: the movie ratings on IMDb, where many users post publicly under their own names.
https://medium.com/@EmiLabsTech/data-privacy-the-netflix-pri...
https://www.cs.utexas.edu/~shmat/shmat_oak08netflix.pdf
https://courses.csail.mit.edu/6.857/2018/project/Archie-Gers...
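
For anyone wondering what "cross-referencing" means mechanically, here is a toy sketch of the linkage idea. It is not the scoring used in the Narayanan and Shmatikov paper; it is a much simplified stand-in that just counts movies where an anonymized record and a public profile agree on roughly the same rating at roughly the same time. All names, IDs, ratings, dates, and thresholds below are made up for illustration, and ratings are assumed to already be on the same scale.

```python
from datetime import date

# Made-up "anonymized" release: opaque ID -> {title: (stars, rating date)}
anonymized = {
    "user_1187": {
        "Movie A": (5, date(2005, 3, 1)),
        "Movie B": (2, date(2005, 3, 9)),
        "Movie C": (4, date(2005, 4, 2)),
    },
    "user_2034": {
        "Movie A": (3, date(2005, 1, 15)),
        "Movie D": (5, date(2005, 2, 20)),
    },
}

# Made-up public reviews posted under real names (hypothetical people)
public = {
    "Alice Example": {
        "Movie A": (5, date(2005, 3, 2)),
        "Movie C": (4, date(2005, 4, 3)),
    },
    "Bob Example": {
        "Movie A": (3, date(2005, 1, 16)),
        "Movie D": (5, date(2005, 2, 21)),
    },
}


def overlap_score(anon_ratings, public_ratings, max_rating_gap=1, max_day_gap=14):
    """Count titles rated in both records with a similar score and date."""
    score = 0
    for title, (pub_stars, pub_day) in public_ratings.items():
        if title not in anon_ratings:
            continue
        anon_stars, anon_day = anon_ratings[title]
        close_rating = abs(anon_stars - pub_stars) <= max_rating_gap
        close_date = abs((anon_day - pub_day).days) <= max_day_gap
        if close_rating and close_date:
            score += 1
    return score


def link(anonymized, public, min_score=2):
    """Map each public name to an anonymized ID when one candidate clearly wins."""
    matches = {}
    for name, pub_ratings in public.items():
        scored = sorted(
            ((overlap_score(ratings, pub_ratings), anon_id)
             for anon_id, ratings in anonymized.items()),
            reverse=True,
        )
        best_score, best_id = scored[0]
        runner_up = scored[1][0] if len(scored) > 1 else 0
        # Only claim a match if it is strong enough and not tied with another record.
        if best_score >= min_score and best_score > runner_up:
            matches[name] = best_id
    return matches


matches = link(anonymized, public)
print(matches)
```

The point of the toy is that no name, zip code, or birthdate is needed: a handful of approximately matching (movie, rating, date) triples is enough to single out one record, because few people share the same small set of ratings.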