Project: From knowledge extraction to knowledge representation
There are different tools and techniques that allow you to:
Represent knowledge using RDF, RDFS and OWL
Extract knowledge from text using Naive Bayes and SVM
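As an illustration of the text classification side, here is a minimal sketch of a multinomial Naive Bayes classifier with add-one smoothing, written with the Python standard library only. The function names, the whitespace tokenizer and the four training sentences are assumptions made for this example; in the project itself you would more likely train a library implementation of Naive Bayes or SVM on your own articles.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Very simple tokenizer (an assumption): lowercase, split on whitespace,
    # keep purely alphabetic tokens only.
    return [t for t in text.lower().split() if t.isalpha()]


def train_naive_bayes(labelled_docs):
    """Count documents per class and word occurrences per class."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in labelled_docs:
        class_counts[label] += 1
        for token in tokenize(text):
            word_counts[label][token] += 1
            vocabulary.add(token)
    return class_counts, word_counts, vocabulary


def predict(model, text):
    """Return the class with the highest log posterior for the given text."""
    class_counts, word_counts, vocabulary = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, float("-inf")
    for label in class_counts:
        # Log prior of the class.
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for token in tokenize(text):
            if token not in vocabulary:
                continue  # ignore words never seen during training
            # Add-one (Laplace) smoothed log likelihood of the word.
            score += math.log(
                (word_counts[label][token] + 1) / (total_words + len(vocabulary))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy training data, invented for this sketch.
TRAINING = [
    ("the team won the football match", "sport"),
    ("the striker scored a late goal", "sport"),
    ("parliament passed the new budget law", "politics"),
    ("the minister announced an election", "politics"),
]

model = train_naive_bayes(TRAINING)
predicted = predict(model, "the striker won the match")
print(predicted)
```

With only four training sentences this is a toy; the point is the shape of the program in step 1 of the pipeline: a document goes in, a class label comes out.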
In this project, the objective is to build a simple end-to-end pipeline that extracts
knowledge from a set of documents, represents this knowledge using RDF, queries the resulting RDF
graph and finally uses reasoning to enrich the extracted knowledge.
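To make the "represent this knowledge using RDF" step concrete, the sketch below turns one classified document into RDF triples serialized as N-Triples, using only the Python standard library. The base IRI http://example.org/news/, the function names and the choice of the Dublin Core title property are assumptions for illustration; an RDF library would normally handle serialization for you.

```python
from urllib.parse import quote

# Hypothetical namespace for this sketch; replace it with your own.
BASE = "http://example.org/news/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
DCTERMS_TITLE = "http://purl.org/dc/terms/title"


def escape_literal(text):
    # Escape the characters that must not appear raw in an N-Triples literal.
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def document_to_ntriples(doc_id, title, label):
    """Serialize one document, its title and its predicted class as N-Triples."""
    doc_iri = f"<{BASE}doc/{quote(str(doc_id), safe='')}>"
    class_iri = f"<{BASE}class/{quote(label, safe='')}>"
    lines = [
        # The predicted label becomes the rdf:type of the document.
        f"{doc_iri} <{RDF_TYPE}> {class_iri} .",
        # The title is stored as a plain string literal.
        f'{doc_iri} <{DCTERMS_TITLE}> "{escape_literal(title)}" .',
    ]
    return "\n".join(lines) + "\n"


ntriples = document_to_ntriples("d1", 'Late "goal" wins the match', "sport")
print(ntriples)
```

Writing the returned string to a file with the .nt extension gives an RDF file that standard RDF tools can load.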
Before starting the implementation of your pipeline, you should collect 30 news articles
about different topics.
In order to build the whole pipeline, you will need four subparts in your project, plus an optional fifth:
1. A program that takes a document as input and outputs the document class.
2. A program that takes as input the document, the document ID and the label predicted
by the learned model. This program should output an RDF file.
a. Optional: You can do some topic modelling to extract more knowledge about
the documents.
b. Optional: You can also perform Named Entity Recognition using
pretrained models from NLTK or spaCy.
3. A program that contains some generic SPARQL queries that will be executed on the RDF file.
4. A program that takes an ontology modelling some class hierarchies over the
labels you have, together with the RDF file, and performs reasoning in order to enrich the extracted
knowledge.
5. Optional: You can run some queries over DBpedia (or any other Linked Open
Dataset) to get more knowledge about the topics you extract from the documents.
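The reasoning in step 4 can be sketched as follows, assuming the ontology only declares a class hierarchy with rdfs:subClassOf. The code applies two RDFS entailment rules (subclass transitivity and type propagation) until no new triple appears. Triples are plain Python tuples with prefixed names, and the class names Football, Sport and Topic are hypothetical; a real implementation would load the ontology and the RDF file with an RDF library and could delegate this step to an existing reasoner.

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"


def infer_types(triples):
    """Return the input triples plus those entailed by the class hierarchy.

    Rules applied until a fixpoint is reached:
      (x rdf:type C), (C rdfs:subClassOf D)        => (x rdf:type D)
      (C rdfs:subClassOf D), (D rdfs:subClassOf E) => (C rdfs:subClassOf E)
    """
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        subclass_pairs = {(s, o) for s, p, o in inferred if p == SUBCLASS_OF}
        new_triples = set()
        for s, p, o in inferred:
            if p in (RDF_TYPE, SUBCLASS_OF):
                # Lift the object one level up the hierarchy, keeping the predicate.
                for sub, sup in subclass_pairs:
                    if sub == o:
                        new_triples.add((s, p, sup))
        if not new_triples <= inferred:
            inferred |= new_triples
            changed = True
    return inferred


# Toy data: one extracted fact and a two-level class hierarchy (hypothetical names).
extracted = {("ex:doc1", RDF_TYPE, "ex:Football")}
ontology = {
    ("ex:Football", SUBCLASS_OF, "ex:Sport"),
    ("ex:Sport", SUBCLASS_OF, "ex:Topic"),
}

enriched = infer_types(extracted | ontology)
for triple in sorted(enriched - extracted - ontology):
    print(triple)
```

After reasoning, a query for all documents of type ex:Sport also returns ex:doc1, even though the classifier only labelled it as ex:Football.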