The Web is a new, large and heterogeneous community in which interaction among users and the possibilities offered by technology may modify existing genres or create new ones. Indeed, most genres borrowed from the paper world have undergone adjustments when moving to the Web (for instance, online newspapers and online manuals). There is also a family of genres created specifically for the Web, e.g. home pages, splash screens, newsletters and hotlists. Besides these, are there other emerging genres on the Web for which a genre label has not yet been coined? Is it possible to capture genres in formation in an automated way? An experiment using cluster analysis was set up to provide initial answers to these questions. Results show that the main clusters are quite well defined and exhibit a number of regularities. Interestingly, Web pages appear to have been clustered according to their rhetorical/discoursal types (informational, instructional, argumentative, etc.) rather than their genre classes (e.g. sermons and editorials, both argumentative, belong to the same cluster). The perception of rhetorical/discoursal types in Web pages was confirmed by a small-scale Web user study.
|Title of host publication||Proceedings of the 8th annual colloquium for the UK special interest group for computational linguistics (CLUK05)|
|Publication status||Published - 2005|
|Event||Proceedings of the 8th annual colloquium for the UK special interest group for computational linguistics (CLUK05) - Manchester, UK. Duration: 1 Jan 2005 → …|