sentenceR
R package for (almost) language-agnostic sentence tokenization

sentenceR is a language-agnostic utility for sentence tokenization of raw text. Using the UDPipe POS tagging pipeline, the package automatically extracts sentences together with their indexes (hence the “crowbar” logo, a reference to extraction). It works with any of the 100+ language models natively provided by the UDPipe package (for more information and installation instructions, see the GitHub repository).
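
To make the idea of "sentences with their indexes" concrete, here is a minimal sketch that calls the udpipe R package directly. It does not use sentenceR's own functions, whose names are not given here; the helper `sentences_from_annotation()` is a hypothetical name introduced only for illustration, and the sketch assumes the annotation data frame has the `doc_id`, `paragraph_id`, `sentence_id` and `sentence` columns that udpipe returns.

```r
# Hypothetical helper (not part of sentenceR): collapse a token-level
# udpipe annotation into one row per sentence, keeping its indexes.
sentences_from_annotation <- function(ann) {
  cols <- c("doc_id", "paragraph_id", "sentence_id", "sentence")
  out <- unique(ann[, cols])
  rownames(out) <- NULL
  out
}

# Usage sketch. It downloads a language model, so it only runs interactively.
if (interactive() && requireNamespace("udpipe", quietly = TRUE)) {
  info  <- udpipe::udpipe_download_model(language = "english")
  model <- udpipe::udpipe_load_model(file = info$file_model)

  text <- "This is the first sentence. Here is a second one."
  ann  <- as.data.frame(udpipe::udpipe_annotate(model, x = text))

  print(sentences_from_annotation(ann))
}
```

Replacing `"english"` with another model name supported by udpipe should be the only change needed for a different language, which is what makes this approach largely language-agnostic.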