A Segment-based Weighting Technique for URL-based Genre Classification of Web Pages



Document title: A Segment-based Weighting Technique for URL-based Genre Classification of Web Pages
Journal: Polibits
Database: PERIÓDICA
System number: 000402942
ISSN: 1870-9044
Authors: 1
Institutions: 1 College of Applied Sciences, Ibri, Ad Dhahirah, Oman
Year:
Period: Jan-Jun
Issue: 53
Pages: 43-47
Country: Mexico
Language: English
Document type: Article
Approach: Analytical
English abstract: We propose a segment-based weighting technique for genre classification of web pages. The technique exploits character n-grams extracted from the URL of a web page rather than from its textual content. The main idea is to segment the URL and assign a weight to each segment. Experiments conducted on three known genre datasets show that our method achieves encouraging results.
Disciplines: Computer science
Keywords: Computer science, Artificial intelligence, Classification, Web pages, Uniform resource locator
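The abstract describes segmenting a URL, extracting character n-grams from each segment, and weighting the n-grams by segment. The following is only a minimal sketch of that general idea, not the authors' implementation: the three-way segmentation (domain, path, query), the weight values in `SEGMENT_WEIGHTS`, the n-gram length, and all function names are assumptions made for illustration, since the record does not specify them.

```python
from collections import Counter
from urllib.parse import urlsplit

# Hypothetical per-segment weights; the abstract does not state actual values.
SEGMENT_WEIGHTS = {"domain": 1.0, "path": 2.0, "query": 0.5}


def char_ngrams(text, n):
    """Return all overlapping character n-grams of text."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def segment_url(url):
    """Split a URL into three coarse segments (an assumed segmentation)."""
    parts = urlsplit(url.lower())
    return {"domain": parts.netloc, "path": parts.path, "query": parts.query}


def weighted_ngram_features(url, n=3, weights=SEGMENT_WEIGHTS):
    """Add the segment's weight once per n-gram occurrence in that segment."""
    features = Counter()
    for name, text in segment_url(url).items():
        for gram in char_ngrams(text, n):
            features[gram] += weights[name]
    return features


if __name__ == "__main__":
    # Example URL chosen purely for illustration.
    feats = weighted_ngram_features("http://example.com/blog/post?id=1")
    for gram, weight in sorted(feats.items()):
        print(repr(gram), weight)
```

A feature map of this kind could then be passed to any standard text classifier; which classifier and which weights work best for genre classification is a question the paper itself addresses, not this sketch.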