Genres in formation? An exploratory study of web pages using cluster analysis

M. Santini

Research output: Chapter in Book/Conference proceeding with ISSN or ISBN › Conference contribution with ISSN or ISBN

Abstract

The Web is a new, large and heterogeneous community in which interaction among users and the possibilities offered by technology may modify existing genres or create new ones. In fact, most genres borrowed from the paper world have undergone adjustments in moving to the Web (for instance, online newspapers and online manuals). There is also a family of genres created specifically for the Web, e.g. home pages, splash screens, newsletters and hotlists. Besides these, are there other emerging genres on the Web for which a genre label has not yet been coined? Is it possible to capture genres in formation in an automated way? An experiment using cluster analysis was set up to provide initial answers to these questions. Results show that the main clusters have a quite well-defined shape and exhibit a number of regularities. Interestingly, Web pages appear to have been clustered according to their rhetorical/discoursal types (informational, instructional, argumentative, etc.) rather than their genre classes (e.g. sermons and editorials, both argumentative, belong to the same cluster). The perception of rhetorical/discoursal types in Web pages was confirmed by a small-scale Web user study.
Original language: English
Title of host publication: Proceedings of the 8th annual colloquium for the UK special interest group for computational linguistics (CLUK05)
Publication status: Published - 2005
Event: Proceedings of the 8th annual colloquium for the UK special interest group for computational linguistics (CLUK05) - Manchester, UK
Duration: 1 Jan 2005 → …


Fingerprint

Cluster analysis
Websites
World Wide Web
Labels
Experiments

Bibliographical note

Article freely available on author's homepage.

Cite this

Santini, M. (2005). Genres in formation? An exploratory study of web pages using cluster analysis. In Proceedings of the 8th annual colloquium for the UK special interest group for computational linguistics (CLUK05)
@inproceedings{d352c9393eac47659af9ff8547fe8b9c,
title = "Genres in formation? An exploratory study of web pages using cluster analysis",
author = "M. Santini",
note = "Article freely available on author's homepage.",
year = "2005",
language = "English",
booktitle = "Proceedings of the 8th annual colloquium for the UK special interest group for computational linguistics (CLUK05)",

}


