Reproducibility of experiments in recommender systems evaluation

Nikolaos Polatidis; Stylianos Kapetanakis; Elias Pimenidis; Konstantinos Kosmidis

University of Brighton

Research output: Chapter in Book/Conference proceeding with ISSN or ISBN › Conference contribution with ISSN or ISBN › peer-review

Abstract

Recommender systems evaluation is usually based on predictive accuracy metrics with better scores meaning recommendations of higher quality. However, the comparison of results is becoming increasingly difficult, since there are different recommendation frameworks and different settings in the design and implementation of the experiments. Furthermore, there might be minor differences on algorithm implementation among the different frameworks. In this paper, we compare well known recommendation algorithms, using the same dataset, metrics and overall settings, the results of which point to result differences across frameworks with the exact same settings. Hence, we propose the use of standards that should be followed as guidelines to ensure the replication of experiments and the reproducibility of the results.
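As a rough illustration of what "the same dataset, metrics and overall settings" can mean in practice, the sketch below fixes the train/test split with a seed and computes two common predictive accuracy metrics (MAE and RMSE). It is not taken from the paper: the function names, the synthetic ratings, the 80/20 split, the seed value and the global-mean baseline are all assumptions made for illustration only.

```python
import math
import random


def split_ratings(ratings, test_fraction=0.2, seed=42):
    """Deterministic split: sort first, then shuffle with a fixed seed,
    so any framework given the same ratings receives the same partitions."""
    rng = random.Random(seed)
    shuffled = sorted(ratings)  # canonical order, independent of input order
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]


def mae(pairs):
    """Mean absolute error over (predicted, actual) pairs."""
    return sum(abs(p - a) for p, a in pairs) / len(pairs)


def rmse(pairs):
    """Root mean squared error over (predicted, actual) pairs."""
    return math.sqrt(sum((p - a) ** 2 for p, a in pairs) / len(pairs))


def global_mean_predictor(train):
    """Hypothetical baseline: predict the mean training rating for every pair."""
    mean = sum(r for _, _, r in train) / len(train)
    return lambda user, item: mean


# Synthetic (user, item, rating) triples, assumed purely for the example.
ratings = [(u, i, float((u * 7 + i * 3) % 5 + 1))
           for u in range(10) for i in range(10)]

train, test = split_ratings(ratings)
predict = global_mean_predictor(train)
pairs = [(predict(u, i), r) for u, i, r in test]
print(f"MAE={mae(pairs):.4f} RMSE={rmse(pairs):.4f}")
```

Reporting the seed, the split ratio and the exact metric definitions alongside the scores is one way such an experiment could be made easier to replicate in a different framework.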

Original language	English
Title of host publication	14th International Conference on Artificial Intelligence Applications and Innovations
Place of Publication	Germany
Publisher	Springer-Verlag
Pages	401-409
Number of pages	9
Volume	519
Publication status	Published - 22 May 2018
Event	14th International Conference on Artificial Intelligence Applications and Innovations - Rhodes, Greece, 25-27 May 2018 Duration: 22 May 2018 → …

Publication series

Name	IFIP Advances in Information and Communication Technology

Conference

Conference	14th International Conference on Artificial Intelligence Applications and Innovations
Period	22/05/18 → …

Bibliographical note

This is a post-peer-review, pre-copyedit version of an article published in IFIP Advances in Information and Communication Technology. The final authenticated version is available online at: http://dx.doi.org/10.1007/978-3-319-92007-8_34

Keywords

Recommender systems
Evaluation
Reproducibility
Replication

Access to Document

LNCS_Reproducibility_finalVersion.pdf (Accepted author manuscript, 292 KB; Licence: Unspecified)

https://link.springer.com/chapter/10.1007/978-3-319-92007-8_34 (Licence: Unspecified)

Nikolaos Polatidis
- N.Polatidisbrighton.acuk
- School of Arch, Tech and Eng - Principal Lecturer
Person: Academic

Cite this

Polatidis, N., Kapetanakis, S., Pimenidis, E., & Kosmidis, K. (2018). Reproducibility of experiments in recommender systems evaluation. In 14th International Conference on Artificial Intelligence Applications and Innovations (Vol. 519, pp. 401-409). (IFIP Advances in Information and Communication Technology). Springer-Verlag. https://link.springer.com/chapter/10.1007/978-3-319-92007-8_34

@inproceedings{fc845f247fc94f31a200c84a421c6d0a,

title = "Reproducibility of experiments in recommender systems evaluation",

abstract = "Recommender systems evaluation is usually based on predictive accuracy metrics with better scores meaning recommendations of higher quality. However, the comparison of results is becoming increasingly difficult, since there are different recommendation frameworks and different settings in the design and implementation of the experiments. Furthermore, there might be minor differences on algorithm implementation among the different frameworks. In this paper, we compare well known recommendation algorithms, using the same dataset, metrics and overall settings, the results of which point to result differences across frameworks with the exact same settings. Hence, we propose the use of standards that should be followed as guidelines to ensure the replication of experiments and the reproducibility of the results.",

keywords = "Recommender systems, Evaluation, Reproducibility, Replication",

author = "Nikolaos Polatidis and Stylianos Kapetanakis and Elias Pimenidis and Konstantinos Kosmidis",

note = "This is a post-peer-review, pre-copyedit version of an article published in IFIP Advances in Information and Communication Technology. The final authenticated version is available online at: http://dx.doi.org/10.1007/978-3-319-92007-8_34; 14th International Conference on Artificial Intelligence Applications and Innovations ; Conference date: 22-05-2018",

year = "2018",

month = may,

day = "22",

language = "English",

volume = "519",

series = "IFIP Advances in Information and Communication Technology",

publisher = "Springer-Verlag",

pages = "401--409",

booktitle = "14th International Conference on Artificial Intelligence Applications and Innovations",

}

Polatidis, N, Kapetanakis, S, Pimenidis, E & Kosmidis, K 2018, Reproducibility of experiments in recommender systems evaluation. in 14th International Conference on Artificial Intelligence Applications and Innovations. vol. 519, IFIP Advances in Information and Communication Technology, Springer-Verlag, Germany, pp. 401-409, 14th International Conference on Artificial Intelligence Applications and Innovations, 22/05/18. <https://link.springer.com/chapter/10.1007/978-3-319-92007-8_34>

Reproducibility of experiments in recommender systems evaluation. / Polatidis, Nikolaos; Kapetanakis, Stylianos; Pimenidis, Elias et al.
14th International Conference on Artificial Intelligence Applications and Innovations. Vol. 519 Germany: Springer-Verlag, 2018. p. 401-409 (IFIP Advances in Information and Communication Technology).

TY - GEN

T1 - Reproducibility of experiments in recommender systems evaluation

AU - Polatidis, Nikolaos

AU - Kapetanakis, Stylianos

AU - Pimenidis, Elias

AU - Kosmidis, Konstantinos

N1 - This is a post-peer-review, pre-copyedit version of an article published in IFIP Advances in Information and Communication Technology. The final authenticated version is available online at: http://dx.doi.org/10.1007/978-3-319-92007-8_34

PY - 2018/5/22

Y1 - 2018/5/22

AB - Recommender systems evaluation is usually based on predictive accuracy metrics with better scores meaning recommendations of higher quality. However, the comparison of results is becoming increasingly difficult, since there are different recommendation frameworks and different settings in the design and implementation of the experiments. Furthermore, there might be minor differences on algorithm implementation among the different frameworks. In this paper, we compare well known recommendation algorithms, using the same dataset, metrics and overall settings, the results of which point to result differences across frameworks with the exact same settings. Hence, we propose the use of standards that should be followed as guidelines to ensure the replication of experiments and the reproducibility of the results.

KW - Recommender systems

KW - Evaluation

KW - Reproducibility

KW - Replication

M3 - Conference contribution with ISSN or ISBN

VL - 519

T3 - IFIP Advances in Information and Communication Technology

SP - 401

EP - 409

BT - 14th International Conference on Artificial Intelligence Applications and Innovations

PB - Springer-Verlag

CY - Germany

T2 - 14th International Conference on Artificial Intelligence Applications and Innovations

Y2 - 22 May 2018

ER -
