cwbtools: Tools to Create, Modify and Manage 'CWB' Corpora

The 'Corpus Workbench' ('CWB') offers a classic and mature approach for working with large, linguistically and structurally annotated corpora. The 'CWB' is memory-efficient and its design makes running queries fast; see Evert (2011). The 'cwbtools' package offers pure 'R' tools to create indexed corpus files as well as high-level wrappers for the original 'C' implementation of 'CWB' as exposed by the 'RcppCWB' package. Additional functionality to add and modify annotations of corpora from within 'R' makes working with 'CWB' indexed corpora much more flexible and convenient. The 'cwbtools' package in combination with the 'R' packages 'RcppCWB' and 'polmineR' offers a lightweight infrastructure to support the combination of quantitative and qualitative approaches for working with textual data.

Version: 0.4.2
Imports: data.table, R6, xml2, stringi, curl, RcppCWB (≥ 0.6.3), pbapply, methods, tools, cli, jsonlite, httr, rstudioapi, zen4R (≥ 0.9), lifecycle, fs
Suggests: tm (≥ 0.7.3), knitr, markdown, tokenizers (≥ 0.2.1), tidytext, SnowballC, janeaustenr, testthat, rmarkdown, aws.s3, quanteda, dplyr
Published: 2024-04-28
DOI: 10.32614/CRAN.package.cwbtools
Author: Andreas Blaette [aut, cre], Christoph Leonhardt [aut]
Maintainer: Andreas Blaette <andreas.blaette at>
License: GPL-3
NeedsCompilation: no
Language: en-US
Citation: cwbtools citation info
Materials: NEWS
CRAN checks: cwbtools results


Reference manual: cwbtools.pdf
Vignettes: Introducing 'cwbtools'


Package source: cwbtools_0.4.2.tar.gz
Windows binaries: r-devel:, r-release:, r-oldrel:
macOS binaries: r-release (arm64): cwbtools_0.4.2.tgz, r-oldrel (arm64): cwbtools_0.4.2.tgz, r-release (x86_64): cwbtools_0.4.2.tgz, r-oldrel (x86_64): cwbtools_0.4.2.tgz
Old sources: cwbtools archive

