Victor is a web page cleaning tool. It removes menus, ads, footers, headers, and similar boilerplate from HTML web pages so that only the main page content remains. Victor is based on conditional random fields (CRF), a sequence labeling model.
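As a rough illustration of the kind of input a sequence labeler needs for this task, the sketch below splits an HTML page into text blocks and computes two simple per-block features (word count and link density). This is not Victor's code or feature set: the names `BlockExtractor` and `extract_blocks`, the choice of block-level tags, and the features themselves are assumptions made for the example, using only Python's standard `html.parser`. The CRF step that would label each block as content or boilerplate is not shown.

```python
from html.parser import HTMLParser

# Tags whose text is never displayed as page content.
SKIP_TAGS = {"script", "style"}
# Tags treated as block boundaries (a simplified, hypothetical choice).
BLOCK_TAGS = {"p", "div", "li", "td", "h1", "h2", "h3", "ul", "ol",
              "nav", "header", "footer", "article", "section"}


class BlockExtractor(HTMLParser):
    """Splits a page into text blocks and records simple features."""

    def __init__(self):
        super().__init__()
        self.blocks = []
        self._text = []
        self._words = 0
        self._link_words = 0
        self._in_link = 0
        self._skip = 0

    def flush(self):
        # Emit the current block if it holds any words, then reset.
        if self._words:
            self.blocks.append({
                "text": " ".join(self._text),
                "words": self._words,
                "link_density": self._link_words / self._words,
            })
        self._text, self._words, self._link_words = [], 0, 0

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self._skip += 1
        elif tag == "a":
            self._in_link += 1
        elif tag in BLOCK_TAGS:
            self.flush()

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS:
            self._skip = max(0, self._skip - 1)
        elif tag == "a":
            self._in_link = max(0, self._in_link - 1)
        elif tag in BLOCK_TAGS:
            self.flush()

    def handle_data(self, data):
        if self._skip:
            return
        words = data.split()
        self._text.extend(words)
        self._words += len(words)
        if self._in_link:
            # Words inside <a> count toward link density.
            self._link_words += len(words)


def extract_blocks(html):
    """Return a list of feature dicts, one per non-empty text block."""
    parser = BlockExtractor()
    parser.feed(html)
    parser.close()
    parser.flush()
    return parser.blocks
```

In a sketch like this, a navigation menu tends to show up as short blocks with link density near 1, while article paragraphs are longer with link density near 0. A sequence model can additionally use the labels of neighboring blocks, which is the usual motivation for CRFs over per-block classifiers.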
THE LINDAT/CLARIN PROJECT (LM2015071 and CZ.02.1.01/0.0/0.0/16_013/0001781; formerly LM2010013) IS FULLY SUPPORTED BY THE MINISTRY OF EDUCATION, YOUTH AND SPORTS OF THE CZECH REPUBLIC UNDER THE PROGRAMME LM OF "LARGE INFRASTRUCTURES".
Copyright (c) 2018 UFAL MFF UK. All rights reserved.