Wikipedia API

A collection of Wikipedia API links for a variety of common tasks:


  1. Searching: Search can be done using either the opensearch action or the query action of the API.
    1. Opensearch method: Use the URL https://en.wikipedia.org/w/api.php?action=opensearch&search=concurrency+java&limit=10&namespace=0&format=json The parameters, shown here with example values, are:
      https://en.wikipedia.org/w/api.php
      ?action=opensearch
      &search=zyz          # search query
      &limit=1             # return only the first result
      &namespace=0         # search only articles, ignoring Talk, Mediawiki, etc.
      &format=json         # return JSON; use jsonfm instead to print the JSON as HTML for debugging.
    2. Query method: Use the URL https://en.wikipedia.org/w/api.php?action=query&list=search&srsearch=what+is+concurrency+java&utf8=&format=json
  2. Titles: The search returns URLs (opensearch) or titles (query search), and each result can then be opened individually. Given a title from a query search, the plain-text introduction can be obtained using https://en.wikipedia.org/w/api.php?format=json&action=query&titles=Java%20concurrency&prop=extracts&exlimit=max&explaintext&exintro and the raw wikitext using https://en.wikipedia.org/w/api.php?format=json&action=query&titles=Java%20ConcurrentMap&prop=revisions&rvprop=content
  3. Infoboxes: Infoboxes give Wikipedia a properly structured side, and this hasn't gone unnoticed. They can be obtained either through the API or by going the long route of downloading the whole dump and parsing it into a graph-based DB (DBpedia does exactly this). The codebase for this can be obtained from https://github.com/dbpedia/extraction-framework/. If, on the other hand, someone is still interested in doing this themselves (although there is no point in reinventing the wheel), they can use one of the following URLs, in whichever format they feel comfortable with:
    1. https://en.wikipedia.org/w/api.php?action=query&prop=revisions&rvprop=content&format=xmlfm&titles=Hydrogen&rvsection=0
    2. https://en.wikipedia.org/w/index.php?action=raw&title=Template:Infobox%20hydrogen
    3. https://en.wikipedia.org/w/api.php?action=parse&page=Template:Infobox%20hydrogen&format=json
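The URLs above can also be built and consumed from code. Below is a minimal Python sketch using only the standard library. The helper names (build_url, opensearch_url, and so on) and the User-Agent string are made up for this illustration and are not part of the Wikipedia API. The parsing helpers assume the default JSON layout: a four-element list for opensearch, and a "pages" object keyed by page id for query.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API_URL = "https://en.wikipedia.org/w/api.php"


def build_url(**params):
    """Return an api.php URL for the given parameters, asking for JSON by default."""
    params.setdefault("format", "json")
    return API_URL + "?" + urlencode(params)


def opensearch_url(query, limit=10):
    # namespace=0 restricts the search to articles
    return build_url(action="opensearch", search=query, limit=limit, namespace=0)


def query_search_url(query):
    return build_url(action="query", list="search", srsearch=query)


def intro_extract_url(title):
    # explaintext and exintro are boolean flags: being present is what matters
    return build_url(action="query", titles=title, prop="extracts",
                     exlimit="max", explaintext=1, exintro=1)


def section0_wikitext_url(title):
    # rvsection=0 returns only the lead section, where the infobox markup usually sits
    return build_url(action="query", titles=title, prop="revisions",
                     rvprop="content", rvsection=0)


def fetch_json(url):
    """Fetch a URL and decode the JSON body (needs network access)."""
    # Placeholder User-Agent; replace it with something that identifies your script
    request = Request(url, headers={"User-Agent": "wikipedia-api-notes-example/0.1"})
    with urlopen(request, timeout=10) as response:
        return json.load(response)


def opensearch_titles_and_urls(data):
    """Pair titles with URLs; opensearch returns [query, titles, descriptions, urls]."""
    return list(zip(data[1], data[3]))


def search_titles(data):
    """Collect the titles from a list=search response."""
    return [hit["title"] for hit in data["query"]["search"]]


def page_extracts(data):
    """Map each page title to its plain-text extract (empty if none was returned)."""
    pages = data["query"]["pages"]
    return {page["title"]: page.get("extract", "") for page in pages.values()}


if __name__ == "__main__":
    hits = search_titles(fetch_json(query_search_url("what is concurrency java")))
    if hits:
        print(page_extracts(fetch_json(intro_extract_url(hits[0]))))
```

Run as a script, this performs a query search and prints the introduction of the first hit. The URL builders and the parsing helpers can be used without any network access.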
