Temporal Features for Web Spam Detection

    General Information

    Below you can find temporal features for Web spam detection calculated from monthly snapshots of the .uk domain between October 2006 and May 2007. The archives contain files in Weka's ARFF format, one for each snapshot pair, i.e., for October-November 2006, for November-December 2006, etc. The hostname-to-ID assignment used in these files can be found there.
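    As an illustration of the file layout described above, the following minimal sketch enumerates the consecutive snapshot pairs and reads a dense ARFF file using only the Python standard library. The function names `snapshot_pairs` and `read_dense_arff` and the sample attribute names are hypothetical. The sketch assumes that every monthly snapshot in the stated range exists and that the files use dense (not sparse) rows with unquoted attribute names; neither assumption is confirmed by this page.

```python
def snapshot_pairs(first, last):
    """Return (earlier, later) pairs of (year, month) for consecutive months.

    Assumption: one snapshot exists for every month from `first` to `last`.
    """
    pairs = []
    year, month = first
    while (year, month) < last:
        nxt = (year + 1, 1) if month == 12 else (year, month + 1)
        pairs.append(((year, month), nxt))
        year, month = nxt
    return pairs


def read_dense_arff(text):
    """Parse a dense ARFF document into (attribute_names, rows).

    Simplified reader: ignores attribute types, does not handle quoted
    names containing spaces, and does not handle sparse `{...}` rows.
    Values are kept as strings; the ARFF missing marker `?` becomes None.
    """
    attributes, rows, in_data = [], [], False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):  # skip blanks and comments
            continue
        lower = line.lower()
        if in_data:
            values = [v.strip() for v in line.split(",")]
            rows.append([None if v == "?" else v for v in values])
        elif lower.startswith("@attribute"):
            attributes.append(line.split()[1])
        elif lower.startswith("@data"):
            in_data = True
    return attributes, rows


# Snapshot range stated on this page: October 2006 to May 2007.
pairs = snapshot_pairs((2006, 10), (2007, 5))

# Hypothetical two-attribute sample; the real attribute names are not given here.
sample = """@relation demo
@attribute hostid numeric
@attribute feature numeric
@data
1,0.5
2,?
"""
names, rows = read_dense_arff(sample)
```

    Under these assumptions the range yields seven pairs, from October-November 2006 to April-May 2007. For the actual archives, a full ARFF reader such as Weka itself is the safer choice.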

    Please see our paper for an overview of temporal Web spam features.

    Link-based Features

    The archives below contain temporal features derived from link-based similarity metrics and from death and growth rates. Transformed features are also included.

    For inquiries, please contact Miklós Erdélyi.

    Last updated: 4 May 2011.
