usage: extract.py [-h] [-v] [-i] [-u] [-s]

extract words from ALTO

optional arguments:
  -h, --help      show this help message and exit
  -v, --verbose   increase output verbosity
  -i, --initial   use inital revision (0) instead of latest
  -u, --unlocked  include unlocked documents
  -s, --sanitise  strip non-unicode characters from words

The `extract` tool extracts words from the ALTO XML files in the Revizor collection and stores them in an SQLite3 database, so that the data in the collection can be listed and explored with SQL queries.
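
To make the idea concrete, here is a minimal sketch of the extract-then-query workflow. It is not the actual `extract.py` code. The table name `words`, its columns, the helper functions, and the tiny sample document are all assumptions made for illustration. The only ALTO detail relied on is that each word is a `String` element whose text is held in its `CONTENT` attribute.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Tiny stand-in ALTO document (hypothetical); real files carry many more attributes.
SAMPLE_ALTO = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v2#">
  <Layout><Page><PrintSpace><TextBlock><TextLine>
    <String CONTENT="hello"/><SP/><String CONTENT="world"/><SP/><String CONTENT="hello"/>
  </TextLine></TextBlock></PrintSpace></Page></Layout>
</alto>"""


def alto_words(xml_text):
    """Return the CONTENT of every ALTO String element, in document order."""
    root = ET.fromstring(xml_text)
    words = []
    for elem in root.iter():
        # Compare local names only, so any ALTO namespace version matches.
        local_name = elem.tag.rsplit("}", 1)[-1]
        if local_name == "String" and elem.get("CONTENT"):
            words.append(elem.get("CONTENT"))
    return words


def store_words(conn, document, words):
    """Insert one row per word into a hypothetical `words` table."""
    conn.execute("CREATE TABLE IF NOT EXISTS words (document TEXT, word TEXT)")
    conn.executemany(
        "INSERT INTO words (document, word) VALUES (?, ?)",
        [(document, w) for w in words],
    )
    conn.commit()


def word_counts(conn):
    """List each distinct word with its frequency, most frequent first."""
    return conn.execute(
        "SELECT word, COUNT(*) AS n FROM words GROUP BY word ORDER BY n DESC, word"
    ).fetchall()


conn = sqlite3.connect(":memory:")  # the real tool would write to a database file
store_words(conn, "sample.xml", alto_words(SAMPLE_ALTO))
print(word_counts(conn))  # [('hello', 2), ('world', 1)]
```

Once the words are in a table like this, ordinary SQL (`GROUP BY`, `LIKE`, joins against document metadata) is enough to explore the collection. The schema the tool actually creates may differ from the one assumed here.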

...