
[InetBib] DBpedia Open Text Extraction Challenge - TextExt

Date: Fri, 3 Mar 2017 15:10:08 +0100
From: Sebastian Hellmann via InetBib <inetbib@xxxxxxxxxx>
Subject: [InetBib] DBpedia Open Text Extraction Challenge - TextExt

*DBpedia Open Text Extraction Challenge - TextExt*

Website: http://wiki.dbpedia.org/textext

*_Disclaimer: The call is under constant development; please refer to the news section. We also acknowledge the initial engineering effort and will be lenient on technical requirements for the first submissions. We will focus evaluation on the extracted triples and allow late submissions if they are coordinated with us._*



     Background

DBpedia and Wikidata currently focus primarily on representing factual knowledge as contained in Wikipedia infoboxes. A vast amount of information, however, is contained in the unstructured Wikipedia article texts. With the DBpedia Open Text Extraction Challenge, we aim to spur knowledge extraction from Wikipedia article texts in order to dramatically broaden and deepen the amount of structured DBpedia/Wikipedia data and provide a platform for benchmarking various extraction tools.



     Mission

Wikipedia has become the ubiquitous source of knowledge for the world, enabling humans to look up definitions, quickly become familiar with new topics, read up on background information for news events, and much more, even settling coffee house arguments via a quick mobile search. The mission of DBpedia in general is to harvest Wikipedia’s knowledge, refine and structure it, and then disseminate it on the web, in a free and open manner, for IT users and businesses.



     News and next events

Twitter: Follow @dbpedia <https://twitter.com/dbpedia>, Hashtag: #dbpedianlp <https://twitter.com/search?f=tweets&q=%23dbpedianlp&src=typd>


 * LDK <http://ldk2017.org/> conference joined the challenge (Deadline
   March 19th and April 24th)

 * SEMANTiCS <http://2017.semantics.cc/> joined the challenge (Deadline
   June 11th and July 17th)

 * Feb 20th, 2017: Full example added to this website

 * March 1st, 2017: Docker image (beta)
   https://github.com/NLP2RDF/DBpediaOpenDBpediaTextExtractionChallenge

Coming soon:

 * beginning of March: full example within the docker image

 * beginning of March: DBpedia full article text and tables (currently
   only abstracts) http://downloads.dbpedia.org/2016-10/core-i18n/


     Methodology

The DBpedia Open Text Extraction Challenge differs significantly from other challenges in language technology and other areas in that it is not a one-time call, but a continuously growing and expanding challenge with the focus to *sustainably* advance the state of the art and transcend boundaries in a *systematic* way. The DBpedia Association and the people behind this challenge are committed to providing the necessary infrastructure and driving the challenge for an indefinite time, as well as potentially extending the challenge beyond Wikipedia.

We provide the extracted and cleaned full text for all Wikipedia articles from 9 different languages at regular intervals for download and as a Docker image in the machine-readable NIF-RDF <http://persistence.uni-leipzig.org/nlp2rdf/> format (example for Barack Obama in English <https://github.com/NLP2RDF/DBpediaOpenDBpediaTextExtractionChallenge/blob/master/BO.ttl>). Challenge participants are asked to wrap their NLP and extraction engines in Docker images and submit them to us. We will run participants’ tools at regular intervals in order to extract:


1. Facts, relations, events, terminology, ontologies as RDF triples
   (Triple track)

2. Useful NLP annotations such as POS tags, dependencies, co-reference
   (Annotation track)
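
To make the two tracks more concrete, here is a minimal Python sketch of what a tool's output for each might look like when serialized as N-Triples. It is an illustration only, not the required submission format: the document URI `http://example.org/doc/1`, the sample sentence, the helper names, and the choice of a plain span annotation (rather than POS tags or dependencies) are all assumptions made for this example. The NIF property names come from the NIF 2.0 core ontology, and offsets follow the `#char=begin,end` fragment scheme.

```python
# Sketch only: helper names, the sample text and the document URI are
# hypothetical and not part of the challenge specification.
NIF = "http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD_NNI = "http://www.w3.org/2001/XMLSchema#nonNegativeInteger"
DBR = "http://dbpedia.org/resource/"
DBO = "http://dbpedia.org/ontology/"


def literal(value, datatype=None):
    """Serialize a value as an N-Triples literal, optionally typed."""
    escaped = (
        str(value).replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    )
    return f'"{escaped}"^^<{datatype}>' if datatype else f'"{escaped}"'


def nt(subject, predicate, obj):
    """Build one N-Triples line; obj must already be serialized."""
    return f"<{subject}> <{predicate}> {obj} ."


def annotation_triples(doc_uri, text, surface):
    """Annotation track: describe the first occurrence of `surface` in NIF."""
    begin = text.index(surface)  # raises ValueError if the span is absent
    end = begin + len(surface)
    span = f"{doc_uri}#char={begin},{end}"
    context = f"{doc_uri}#char=0,{len(text)}"
    return [
        nt(span, RDF_TYPE, f"<{NIF}String>"),
        nt(span, NIF + "referenceContext", f"<{context}>"),
        nt(span, NIF + "beginIndex", literal(begin, XSD_NNI)),
        nt(span, NIF + "endIndex", literal(end, XSD_NNI)),
        nt(span, NIF + "anchorOf", literal(surface)),
    ]


def fact_triple(subject, prop, obj):
    """Triple track: one extracted fact using DBpedia resource/ontology URIs."""
    return nt(DBR + subject, DBO + prop, f"<{DBR}{obj}>")


TEXT = "Barack Obama was born in Honolulu."
DOC = "http://example.org/doc/1"  # hypothetical document URI

annotations = annotation_triples(DOC, TEXT, "Honolulu")
fact = fact_triple("Barack_Obama", "birthPlace", "Honolulu")

if __name__ == "__main__":
    print("\n".join(annotations + [fact]))
```

A real submission would read the provided NIF context strings instead of a hard-coded sentence and would run inside the participant's Docker image; a dedicated RDF library would also be a more robust choice than hand-built strings.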

We allow submissions 2 months prior to selected conferences (currently _http://ldk2017.org/_ and _http://2017.semantics.cc/_). Participants that fulfil the technical requirements and provide a sufficient description will be able to present at the conference and be included in the yearly proceedings. *At each conference, the challenge committee will select a winner among challenge participants, who will receive 1000€.*



     Results

Every December, we will publish a summary article and proceedings of participants’ submissions at _http://ceur-ws.org/_. The first proceedings are planned to be published in Dec 2017. We will try to briefly summarize any intermediate progress online in this section.



     Acknowledgements

We would like to thank the Computer Center of Leipzig University for giving us access to their 6TB RAM server Sirius to run all extraction tools.

The project was created with the support of the H2020 EU projects HOBBIT <https://project-hobbit.eu/> (GA-688227) and ALIGNED <http://aligned-project.eu/> (GA-644055) as well as the BMWi project Smart Data Web <http://smartdataweb.de/> (GA-01MD15010B).



     Challenge Committee

 * Sebastian Hellmann, AKSW, DBpedia Association, KILT Competence
   Center, InfAI, Leipzig

 * Sören Auer, Fraunhofer IAIS, University of Bonn

 * Ricardo Usbeck, AKSW, Simba Competence Center, Leipzig University

 * Dimitris Kontokostas, AKSW, DBpedia Association, KILT Competence
   Center, InfAI, Leipzig

 * Sandro Coelho, AKSW, DBpedia Association, KILT Competence Center,
   InfAI, Leipzig

Contact Email: dbpedia-textext-challenge@infai.org
