Description
A set of deliberately deteriorated versions of the publicly available non-commercial IMDB database, each containing a different amount of duplicates.
The datasets were extracted from PostgreSQL databases containing the relations titles, name_basics, title_episode, title_ratings, and title_principals, built from the IMDB data at https://datasets.imdbws.com/ (version downloaded on April 7th, 2024). These databases were deteriorated on purpose to experiment with the Red2Hunt method, which generates a redundancy-free database from any operational relational database comprising surrogate keys and duplicates.
Download instructions
Each file is a full dump of a PostgreSQL database including the schema and the data.
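As a rough sketch of how one of these dumps might be restored, the Python helper below picks a restore command based on the first bytes of the file. The dump format is not stated on this page, so this is an assumption: custom-format pg_dump archives start with the bytes PGDMP and are restored with pg_restore, and anything else is treated here as a plain SQL script for psql. The file name imdb_dump_file and the database name imdb_restored are hypothetical placeholders.

```python
import subprocess

# Custom-format pg_dump archives begin with these magic bytes.
PG_CUSTOM_MAGIC = b"PGDMP"


def read_header(dump_path: str, size: int = 5) -> bytes:
    """Read the first bytes of the dump file to guess its format."""
    with open(dump_path, "rb") as handle:
        return handle.read(size)


def is_custom_format(header: bytes) -> bool:
    """True if the header looks like a pg_dump custom-format archive."""
    return header.startswith(PG_CUSTOM_MAGIC)


def build_restore_command(dump_path: str, dbname: str, header: bytes) -> list[str]:
    """Return the command line that would restore the dump into dbname.

    Assumption: anything that is not a custom-format archive is treated
    as a plain SQL script (tar or directory formats are not detected).
    """
    if is_custom_format(header):
        return ["pg_restore", "--no-owner", "--dbname", dbname, dump_path]
    return ["psql", "--dbname", dbname, "--file", dump_path]


def restore(dump_path: str, dbname: str) -> None:
    """Create the target database, then load the dump into it."""
    subprocess.run(["createdb", dbname], check=True)
    command = build_restore_command(dump_path, dbname, read_header(dump_path))
    subprocess.run(command, check=True)


# Hypothetical usage (requires the PostgreSQL client tools on PATH):
# restore("imdb_dump_file", "imdb_restored")
```

Given the total size of the files, it is probably worth checking the available disk space before restoring, since a restored database usually takes more room than its dump.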
Licence
Publication date
3 June 2024
Author(s)
Mathilde MARCY, Jean-Marc PETIT
Version
version1
Dataset size
70 GB - 7 dump files