A Segment-based Weighting Technique for URL-based Genre Classification of Web Pages



Document title: A Segment-based Weighting Technique for URL-based Genre Classification of Web Pages
Journal: Polibits
Database: PERIÓDICA
System number: 000402942
ISSN: 1870-9044
Authors: 1
Institutions: 1College of Applied Sciences, Ibri, Ad Dhahirah, Oman
Year:
Season: Jan-Jun
Number: 53
Pages: 43-47
Country: Mexico
Language: English
Document type: Article
Approach: Analytical
English abstract: We propose a segment-based weighting technique for genre classification of web pages. The technique exploits character n-grams extracted from the URL of a web page rather than from its textual content. The main idea is to segment the URL and assign a weight to each segment. Experiments conducted on three known genre datasets show that our method achieves encouraging results.
Disciplines: Computer science
Keyword: Computer science,
Artificial intelligence,
Classification,
Web pages,
Uniform resource locator