Data Sets for Recommender Systems

— a collection of commonly used data sets

Random

To generate a small sample of random data, you can use the [data generator] that we designed.
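For readers who only need a toy rating file, the idea can be sketched in a few lines of Python. This is not the data generator linked above. The function name `generate_random_ratings`, its parameters, and the integer `[1, 5]` scale are assumptions made for illustration only.

```python
import random


def generate_random_ratings(num_users, num_items, num_ratings, scale=(1, 5), seed=0):
    """Return a list of unique (user_id, item_id, rating) triples.

    Hypothetical helper for illustration; not the authors' generator.
    """
    if num_ratings > num_users * num_items:
        raise ValueError("more ratings requested than user-item pairs exist")
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    low, high = scale
    # Sample distinct cells of the user-item matrix, then map each cell
    # index back to a 1-based (user, item) pair.
    cells = rng.sample(range(num_users * num_items), num_ratings)
    ratings = []
    for cell in cells:
        user_index, item_index = divmod(cell, num_items)
        ratings.append((user_index + 1, item_index + 1, rng.randint(low, high)))
    return ratings


sample = generate_random_ratings(num_users=10, num_items=20, num_ratings=30)
for user_id, item_id, rating in sample[:5]:
    print(user_id, item_id, rating)
```

Sampling cell indices without replacement guarantees that no user rates the same item twice, which most recommender toolkits expect of their input.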

Declaration
The datasets that we crawled were originally used in our own research and published papers. We make them publicly available because they may benefit other researchers. If you find them useful for your research, please cite our papers in recognition of our data collection effort.

FilmTrust

FilmTrust is a small dataset crawled from the entire FilmTrust website in June 2011 [download].

CiaoDVD

CiaoDVD is a dataset crawled from the entire DVD category of the dvd.ciao.co.uk website in December 2013 [download].

A Short List of Recommendation Data Sets

More data sets will be added to the following table.
| Data Set | Basic Meta: Users | Basic Meta: Items | Basic Meta: Ratings | Basic Meta: Scale | Basic Meta: Density | User Context: Users | User Context: Links | User Context: Type | Other Contexts: Items | Other Contexts: Labels |
|---|---|---|---|---|---|---|---|---|---|---|
| Ciao | 7,375 | 99,746 | 278,483 | [1, 5] | 0.0379% | 7,375 | 111,781 | Trust | General | |
| Douban | 129,490 | 58,541 | 16,830,839 | [1, 5] | 0.222% | 129,490 | 1,692,952 | Friendship | Movie | |
| Epinions (665K) | 40,163 | 139,738 | 664,824 | [1, 5] | 0.0118% | 49,289 | 487,183 | Trust | General | |
| Epinions (510K) | 71,002 | 104,356 | 508,960 | [1, 5] | 0.00687% | | | Trust | General | |
| Epinions (Extended) | 120,492 | 755,760 | 13,668,320 | [1, 5] | 0.015% | | | Trust, Distrust | General | |
| Flixster | 147,612 | 48,794 | 8,196,077 | [0.5, 5.0] | 0.1138% | 787,213 | 11,794,648 | Friendship | Movie | |
| FilmTrust | 1,508 | 2,071 | 35,497 | [0.5, 4.0] | 1.14% | 1,642 | 1,853 | Trust | Movie | |
| Jester | 59,132 | 140 | 1,761,439 | Explicit | 21.28% | | | | Joke | |
| MovieLens 100K | 943 | 1,682 | 100,000 | [1, 5] | 6.30% | | | | Movie | Tag |
| MovieLens 1M | 6,040 | 3,706 | 1,000,209 | [1, 5] | 4.47% | | | | Movie | Tag |
| MovieLens 10M | 71,567 | 10,681 | 10,000,054 | [1, 5] | 1.308% | | | | Movie | Tag |
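The Density figures in the table are consistent with the usual definition: the number of ratings divided by the number of cells in the user-item matrix (users × items). The sketch below recomputes two of them from the Users, Items, and Ratings columns. The function names `density` and `density_percent` are illustrative and do not come from the original page.

```python
def density(num_users: int, num_items: int, num_ratings: int) -> float:
    """Return the fraction of user-item pairs that carry a rating."""
    if num_users <= 0 or num_items <= 0:
        raise ValueError("num_users and num_items must be positive")
    return num_ratings / (num_users * num_items)


def density_percent(num_users: int, num_items: int, num_ratings: int, digits: int = 2) -> float:
    """Return the density as a percentage, rounded to `digits` decimals."""
    return round(100.0 * density(num_users, num_items, num_ratings), digits)


# Figures taken from the FilmTrust and MovieLens 100K rows of the table.
filmtrust = density_percent(1508, 2071, 35497)
movielens_100k = density_percent(943, 1682, 100000)

print(f"FilmTrust density: {filmtrust}%")
print(f"MovieLens 100K density: {movielens_100k}%")
```

Both values agree with the table (1.14% and 6.30%) once rounded to two decimals. The same check can be run on any row before a data set is used in an experiment.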