RubyGems

wp2txt 0.5.1

WP2TXT extracts plain text from a Wikipedia dump file (XML, compressed with bzip2), stripping all MediaWiki markup and other metadata.

Gemfile:
gem 'wp2txt', '~> 0.5.1'

install:
gem install wp2txt -v 0.5.1

Versions:

  1. 2.1.1 February 21, 2026 (300 KB)
  2. 2.1.0 February 19, 2026 (299 KB)
  3. 1.1.3 May 13, 2023 (7.78 MB)
  4. 1.1.2 April 15, 2023 (7.78 MB)
  5. 1.1.1 January 25, 2023 (7.78 MB)
  6. 0.5.1 January 16, 2013 (279 KB)
Show all versions (31 total)

Runtime Dependencies (5):

bzip2-ruby >= 0
json >= 0
nokogiri >= 0
sanitize >= 0
trollop >= 0

Development Dependencies (1):

rspec >= 0

Authors:

  • Yoichiro Hasebe

Total downloads 71,805

For this version 2,899

Version Released: January 16, 2013

Licenses:

N/A

Required Ruby Version: None
