Language Miner

    1 June, 2005

    2005-2007. Joint project with the Machine Learning and Human-Computer Interfaces Group to develop language models, with CRM applications, using self-organized learning processes that can learn from extremely large corpora of unannotated texts.

    Description

    A critical factor in the successful operation of today’s enterprises is making the enterprise’s documents easily accessible to its employees. The “Language Miner” project addresses this market by exploring a new methodology for developing language models using self-organized learning processes that can learn from extremely large corpora of unannotated texts. With this approach, the influence of the ad-hoc elements present in today’s typical language models can be drastically reduced, thereby improving the performance of the new models significantly. The project brings together researchers from computational linguistics, mathematics, cognitive science, physics, data mining and machine learning. This multi-disciplinary approach will open up new routes to a possible breakthrough in language technology. The project targets not only fundamental issues in language modeling but also specific language technologies and applications that build on the newly developed language models. The involvement of industrial partners and end-users ensures that practical end-user requirements will influence the research from the very beginning of the project.

    Participants

    MTA SZTAKI (Machine learning group and Data Mining and Web Search Group), ELTE (Department of Computer Science, Physics of Complex Systems), BME (Math. Institute Department of Stochastic Analysis), Research Institute for Linguistics (HAS), MTA SZFKI, Omega Consulting, Pont Rendszerház

    Status: 
    Active
