gibasa 1.1.2
- Bumped minimum R version to 4.2.0.
- Refactored `tagger_impl` to improve the performance of `tokenize(split = TRUE)`.
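A minimal sketch of the affected code path. It assumes `split = TRUE` splits each text into sentences before parsing (as reflected in the `sentence_id` column), that the default column names are `doc_id` and `text`, and that MeCab and a dictionary are installed:

```r
library(gibasa)

dat <- data.frame(
  doc_id = "a",
  text = "今日は晴れです。明日は雨です。"
)

# split = TRUE parses the text sentence by sentence;
# the refactoring speeds up this code path.
res <- tokenize(dat, split = TRUE)
```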
gibasa 1.1.1
- `tokenize` now warns rather than throws an error when invalid input is given during partial parsing. With this change, `tokenize` no longer aborts entirely when an invalid string is given; parsing of those strings is simply skipped.
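A hedged sketch of the new behavior. Treating an `NA` element as the "invalid input" is an assumption for illustration, and MeCab plus a dictionary are required to run it:

```r
library(gibasa)

dat <- data.frame(
  doc_id = c("a", "b"),
  text = c("こんにちは", NA_character_)
)

# In partial parsing mode, the invalid element now triggers a warning
# and is skipped; the valid element is still tokenized.
res <- tokenize(dat, partial = TRUE)
```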
gibasa 1.1.0
- Corrected the probabilistic IDF calculation of `global_idf3`.
- Refactored `bind_tf_idf2`.
- Breaking Change: Changed the behavior when `norm = TRUE`. Cosine normalization is now performed on `tf_idf` values, as in the RMeCab package.
- Added `tf = "itf"` and `idf = "df"` options.
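A minimal sketch of the new options. Only `tf`, `idf`, and `norm` come from this entry; relying on default column names `doc_id`, `token`, and `n` for the counted input is an assumption:

```r
library(gibasa)

counts <- data.frame(
  doc_id = c("a", "a", "b"),
  token  = c("晴れ", "雨", "晴れ"),
  n      = c(2L, 1L, 1L)
)

# norm = TRUE cosine-normalizes the tf_idf values within each document,
# following the RMeCab convention; "itf" and "df" are the new weighting options.
counts |>
  bind_tf_idf2(tf = "itf", idf = "df", norm = TRUE)
```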
 
gibasa 1.0.1
- Added wrappers around the MeCab dictionary compiler.
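A heavily hedged sketch: the wrapper name `build_user_dic()` and its argument names are assumptions about the interface, and all paths are placeholders. Consult the package reference for the exact usage:

```r
library(gibasa)

# Compile a CSV of user dictionary entries into a binary user dictionary,
# using an installed system dictionary as the base.
build_user_dic(
  dic_dir  = "/path/to/sys_dic",   # base system dictionary directory
  file     = "user.dic",           # compiled output
  csv_file = "entries.csv",        # source entries
  encoding = "utf8"
)
```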
gibasa 0.9.5
- Removed audubon dependency for maintainability.
- `pack` now preserves the `doc_id` type when it is a factor.
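A short sketch of the preserved type. The default `pack()` behavior (concatenating the `token` column back into one row per document) is assumed, and MeCab with a dictionary is required:

```r
library(gibasa)

dat <- data.frame(
  doc_id = factor(c("b", "a")),
  text = c("明日は雨です", "今日は晴れです")
)

# After tokenizing and packing the tokens back into one row per document,
# doc_id keeps its factor type and original levels.
res <- tokenize(dat) |> pack()
class(res$doc_id)
```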
gibasa 0.9.4
- Updated Makevars for Unix-alikes. Users can now use a file specified by the `MECABRC` environment variable or `~/.mecabrc` to set up dictionaries.
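For instance (the paths below are placeholders for an actual MeCab installation):

```r
# Point MeCab at a specific mecabrc file, which names the
# dictionary directory to use.
Sys.setenv(MECABRC = "/opt/mecab/etc/mecabrc")

# Alternatively, a ~/.mecabrc file containing a line such as
#   dicdir = /opt/mecab/lib/mecab/dic/ipadic
# is picked up without setting the environment variable.
```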
gibasa 0.9.3
- Removed unnecessary C++ files.
gibasa 0.9.2
- Prepared for CRAN release.
gibasa 0.8.1
- For performance, `tokenize` now skips resetting the output encodings to UTF-8.
gibasa 0.8.0
- Breaking Change: Changed the numbering style of `sentence_id` when `split` is `FALSE`.
- Added a `grain_size` argument to `tokenize` (see the sketch below).
- Added a new `bind_lr` function.
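A minimal sketch of the new argument. Interpreting `grain_size` as the per-task chunk size handed to the parallel scheduler is an assumption, and the value shown is purely illustrative:

```r
library(gibasa)

dat <- data.frame(
  doc_id = as.character(seq_len(1000)),
  text = rep("今日は晴れです", 1000)
)

# A larger grain size makes each parallel task handle more documents,
# which can reduce scheduling overhead for many short texts.
res <- tokenize(dat, grain_size = 10L)
```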
gibasa 0.7.4
- Use `RcppParallel::parallelFor` instead of `tbb::parallel_for`. There are no user-visible changes.
gibasa 0.7.1
- Fixed documentation. There are no visible changes.
gibasa 0.7.0
- `tokenize` can now accept a character vector in addition to a data.frame-like object (see the sketch below).
- `gbs_tokenize` is now deprecated. Please use the `tokenize` function instead.
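A small sketch of the new call style (MeCab and a dictionary are required to run it):

```r
library(gibasa)

# A bare character vector is now accepted directly,
# without wrapping it in a data.frame first.
res <- tokenize(c("今日は晴れです", "明日は雨です"))
```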
gibasa 0.6.4
gibasa 0.6.3
- Added the `partial` argument to `gbs_tokenize` and `tokenize`. This argument controls the partial parsing mode, which, when activated, forces the extraction of the given chunks of sentences.
gibasa 0.6.2
- Friendlier errors are returned when an invalid dictionary path is provided.
- Added a new `posDebugRcpp` function.
gibasa 0.6.1
- Restored some missing examples.
gibasa 0.6.0
- Functions added in version ‘0.5.1’ were moved to the ‘audubon’ package (>= 0.4.0).
gibasa 0.5.1
- Added some new functions.
- `bind_tf_idf2` can calculate and bind the term frequency, inverse document frequency, and tf-idf of a tidy text dataset.
- `collapse_tokens`, `mute_tokens`, and `lexical_density` can be used for handling a tidy text dataset of tokens.
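A hedged sketch of how one of these helpers might be used. The tidy-evaluated condition on a `POS1` column (produced by `prettify` under the IPA dictionary) and the exact argument pattern of `mute_tokens` are assumptions:

```r
library(gibasa)

dat <- data.frame(doc_id = "a", text = "今日は晴れです")

toks <- tokenize(dat) |>
  prettify()

# Replace particles and auxiliary verbs with NA while keeping the rows,
# so token positions within each document are preserved.
toks |> mute_tokens(POS1 %in% c("助詞", "助動詞"))
```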
 
gibasa 0.5.0
- gibasa now includes the MeCab source, so users do not need to pre-install the MeCab library when building and installing the package (to use `tokenize`, MeCab and its dictionaries still need to be installed and available).
gibasa 0.4.1
- `tokenize` now preserves the original order of `docid_field`.
gibasa 0.4.0
- Added the `bind_tf_idf2` and `is_blank` functions.
gibasa 0.3.1
gibasa 0.3.0
- Changed build process on Windows.
- Added a vignette.
gibasa 0.2.1
- `prettify` can now extract only the columns specified by `col_select`.
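For instance (the `POS1`/`POS2` column names follow the IPA dictionary defaults, and passing them as a character vector to `col_select` is an assumption here):

```r
library(gibasa)

dat <- data.frame(doc_id = "a", text = "今日は晴れです")

# Expand the feature string, but keep only the first two
# part-of-speech columns instead of every dictionary feature.
tokenize(dat) |>
  prettify(col_select = c("POS1", "POS2"))
```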
gibasa 0.2.0
- Added a `NEWS.md` file to track changes to the package.
- `tokenize` now takes a data.frame as its first argument and returns a data.frame only. The former function, which takes a character vector and returns a data.frame or a named list, was renamed to `gbs_tokenize`.
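A minimal sketch of the renamed pair. The column names `doc_id` and `text` are assumed to match the defaults, and MeCab with a dictionary is required:

```r
library(gibasa)

dat <- data.frame(
  doc_id = c("a", "b"),
  text = c("今日は晴れです", "明日は雨です")
)

# The data.frame interface: a data.frame goes in, a data.frame of tokens comes out.
res <- tokenize(dat)

# The former character-vector interface now lives under this name.
res2 <- gbs_tokenize(c("今日は晴れです", "明日は雨です"))
```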