Packages

sentenceR
R package for (almost) language-agnostic sentence tokenization

sentenceR is a language-agnostic utility for sentence tokenization of raw text. Using the UDPipe POS tagging pipeline, the package automatically extracts sentences together with their indexes (hence the “crowbar” logo, a reference to extraction). It works with any of the 100+ language models natively provided by the UDPipe package (for more information and installation instructions, see the GitHub repository).


bws

R package for bootstrapping wordscores models

bws is a bootstrapping utility for stabilizing scaling scores across different reference documents. Built on top of quanteda’s wordscores function, the package automatically scales multiple wordscores models using user-defined pairs of reference documents and averages the results into stabilized scaling scores (for more information and installation instructions, see the GitHub repository).
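
The idea of fitting one wordscores model per pair of reference documents and averaging the results can be illustrated with a small, dependency-free Python sketch. This is not the bws API and not quanteda's implementation: the function names (`word_scores`, `text_score`, `averaged_score`), the whitespace tokenization, and the fixed anchor positions of -1 and +1 are all assumptions made for illustration, and the sketch omits smoothing and rescaling.

```python
from collections import Counter
from statistics import mean


def word_scores(ref_texts, ref_positions):
    """Score each word as the position-weighted share of its relative
    frequency across the reference texts (basic wordscores logic)."""
    rel_freqs = []
    for text in ref_texts:
        counts = Counter(text.lower().split())
        total = sum(counts.values())
        rel_freqs.append({w: c / total for w, c in counts.items()})

    vocab = set().union(*rel_freqs)
    scores = {}
    for w in vocab:
        denom = sum(f.get(w, 0.0) for f in rel_freqs)
        scores[w] = sum(
            f.get(w, 0.0) / denom * pos
            for f, pos in zip(rel_freqs, ref_positions)
        )
    return scores


def text_score(text, scores):
    """Score a new text as the frequency-weighted mean of its scored words.
    Returns None when the text shares no words with the references."""
    counts = Counter(w for w in text.lower().split() if w in scores)
    total = sum(counts.values())
    if total == 0:
        return None
    return sum(c / total * scores[w] for w, c in counts.items())


def averaged_score(target, ref_pairs):
    """Fit one model per (low, high) reference pair, anchored at -1 and +1
    (an assumption of this sketch), and average the target's scores."""
    estimates = []
    for low, high in ref_pairs:
        s = text_score(target, word_scores([low, high], [-1.0, 1.0]))
        if s is not None:
            estimates.append(s)
    return mean(estimates) if estimates else None


# Hypothetical toy reference pairs, invented for illustration only.
pairs = [
    ("tax cut market", "welfare state union"),
    ("market freedom tax", "union welfare equality"),
]
print(averaged_score("tax welfare", pairs))
```

Averaging over several pairs means that no single choice of reference documents determines the final position of a text, which is the stabilizing effect described above.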