Downloads

    Roland-Garros 2017 Twitter collection

    We collected tweets about Roland-Garros, the French Open tennis tournament, for Days 1-15 of the 2017 event. The dataset contains the mentions between Twitter users as well as the accounts of the tennis players who participated in the tournament. The schedule of Days 1-15, downloaded from the official event website, is also provided as ground truth.

    You can find more information about the dataset in this GitHub repository:
    https://github.com/ferencberes/online-centrality

    OpenCL implementation of Similarity kernel

    An OpenCL implementation of a similarity kernel based on the following distances:

    • L2
    • L1
    • Jensen-Shannon

    GitHub link: https://github.com/daroczyb/simker
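
    For reference, the three distances listed above can be sketched in plain Python using their standard textbook definitions. This is only an illustration and is not taken from the simker code: the function names are hypothetical, the natural logarithm is an assumption, and the Jensen-Shannon variant shown is the divergence (not its square root), which assumes both inputs are probability distributions of equal length.

```python
import math


def l2_distance(p, q):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def l1_distance(p, q):
    """Manhattan (L1) distance between two equal-length vectors."""
    return sum(abs(a - b) for a, b in zip(p, q))


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q), natural logarithm.

    Terms with p_i == 0 contribute nothing and are skipped.
    """
    return sum(a * math.log(a / b) for a, b in zip(p, q) if a > 0)


def jensen_shannon_divergence(p, q):
    """Jensen-Shannon divergence of two probability distributions.

    Assumption: p and q are non-negative and each sums to 1.
    """
    # Mixture distribution; m_i > 0 wherever p_i > 0 or q_i > 0.
    m = [(a + b) / 2 for a, b in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)
```

    With the natural logarithm, the Jensen-Shannon divergence is bounded above by ln 2, which is reached for distributions with disjoint support.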

    MOL BUBI Analytics Challenge - training and test data

    - A description of the files can be found at https://dms.sztaki.hu/bubi/#/app/dataset
    - The train and test files have the same columns and format

    Co-cluster

    Co-cluster is a clustering framework implemented in C++. It supports clustering and bi-clustering with several different distance measures, and it handles sparse data sets effectively. It can run on multiple input data sets, using a different distance measure for each input, and aggregate the resulting distances with predefined weights. Download from GitHub: http://github.com/siklosid/co-cluster.git
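
    The aggregation step described above can be pictured with a small sketch. It assumes that the aggregation is a plain weighted sum of per-input distance matrices; the actual rule used by Co-cluster may differ, and the function name and the list-of-lists layout are hypothetical choices made for illustration only.

```python
def aggregate_distances(distance_matrices, weights):
    """Combine several n x n distance matrices into one weighted sum.

    Assumption: one predefined weight per input, aggregation by weighted sum.
    """
    if len(distance_matrices) != len(weights):
        raise ValueError("need exactly one weight per distance matrix")
    n = len(distance_matrices[0])
    combined = [[0.0] * n for _ in range(n)]
    for matrix, weight in zip(distance_matrices, weights):
        for i in range(n):
            for j in range(n):
                combined[i][j] += weight * matrix[i][j]
    return combined
```

    For example, two inputs measured with different distances could be combined as aggregate_distances([d_first, d_second], [0.5, 0.25]), where d_first and d_second are the per-input distance matrices.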

    Correlation Learning

    The source code below extends the Lemur RankLib toolkit.

    RecSys Challenge 2015 - Team Budapest

    Features

    • session_time: Unix timestamp of the session.
    • session_hour: hour of the day at session_time.
    • session_hour_threshold: 2 if session_hour is between 5 and 18, 1 if session_hour is between 3 and 5 or between 18 and 20, and 0 otherwise.
    • session_day: day of the week at session_time.
    • session_length: length of the session in seconds.
    • session_length_diff: difference between session_length and 1,200 seconds.
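
    The session features above can be sketched as follows. Several details are not stated in the list and are assumptions here: session_time is taken to be the start of the session, hours are computed in UTC, the hour ranges are treated as half-open intervals (5 <= hour < 18 and so on), the day of the week is numbered from Monday = 0, and session_length_diff is a signed difference. The function name and the end_ts parameter are hypothetical.

```python
from datetime import datetime, timezone


def session_features(start_ts, end_ts):
    """Derive the time-based session features from two Unix timestamps."""
    start = datetime.fromtimestamp(start_ts, tz=timezone.utc)
    hour = start.hour

    # Assumed boundary handling: half-open intervals.
    if 5 <= hour < 18:
        hour_threshold = 2
    elif 3 <= hour < 5 or 18 <= hour < 20:
        hour_threshold = 1
    else:
        hour_threshold = 0

    length = end_ts - start_ts
    return {
        "session_time": start_ts,
        "session_hour": hour,
        "session_hour_threshold": hour_threshold,
        "session_day": start.weekday(),  # Monday = 0 (assumption)
        "session_length": length,
        "session_length_diff": length - 1200,  # signed (assumption)
    }
```

    If the original features were computed in a local time zone, the hour-based features would shift accordingly.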

    RecSys Challenge 2014

    Data quality improvement and data integration

    A short summary of our group's data quality improvement and data integration solutions.

    Twitter influence subgraphs

    Anonymized Twitter influence subgraphs are available here
