- mplot.lm() for models where
broom::tidy() doesn’t record residuals.
- mosaicCore, makeFun()
now accepts one-sided formulas such as makeFun(~ x^2).
- p = 0 or p = 1 to the
q-function throws an error. [See #779]
- compare(), and
design_plot().
- mplot(model, which = 1) now uses raw residuals rather
than standardized/studentized. This matches behavior in
plot().
- na.rm argument to prop.test().
- qdata() so that it is always a
named vector.
- cdata() so that it is always a
data frame. Also changed names to “lo” and “hi”.
- xpchisq() caused by introducing explicit
arguments and failing to retain .... (Issue #737)
- xpt() caused by introducing explicit
arguments and failing to handle missing ncp correctly.
(Issue #736)
- googleMap() has been deprecated due to a change in policy
at Google. Try leaflet_map() as an alternative.
- do().
- xpt(), xqt(), etc. now have
more explicit arguments. This provides additional help and prompts for
the user.
- percs() and counts() are re-exported from
mosaicCore.
- confint(), attempting to set the confidence
level using conf.level instead of level throws
an error and provides a reminder to use level for that
purpose.
- confint() methods for binom.test() have
been modified a bit. See documentation for how names map to
methods.
- ggformula is used for plotting in more places (replacing
older lattice code).
- CIdata() now handles negative numbers
correctly.
- mplot.lm() now removes points with leverage 1
to avoid errors and warnings; a warning message notifies which points
have been removed.
- TukeyHSD() now correctly follows
system = "gg"
- mplot.lm() now uses ggrepel to place
labels and offers additional controls for the smooth curve that is
overlaid. [gg version of plots only]
- orrr(), oddsRatio(), and
relrisk() now accept a 2x2 data frame to match claims in
documentation.
- cor(~y, ~x)
- prop.test() so it handles
success argument properly for 2-way tables.
- ggformula.
- which argument added to
mplot.TukeyHSD().
- ggformula.
- mosaic compatible with
ggplot2 version 3.0.
- ggplot2 rather
than lattice by default.
- cnorm(), ct(), xcnorm(), and
xct() added to find central portions of distributions.
- mosaicCore
- mosaicCore.
- ggplot2 rather than
lattice.
- mplot() on linear models when system =
"gg".
- formals().
- xpnorm() and friends now use ggplot2 and
can return the plot object, if requested.
- t.test() has been completely reimplemented. It no
longer supports “bare variable mode”, but it is more similar to
stats::t.test() in some cases.
- gwm() has been removed since it no longer works with
the current version of dplyr.
- mosaicModel package.
- props() and counts() have been added. They
are a bit like tally() but designed to play well with
df_stats(). Currently the formula versions drop missing
data, but that will likely be determined by a user-supplied option in
the future.
- mosaicCalc.
- mosaic depends on ggformula, so users will
have lattice, ggplot2, and
ggformula available after loading mosaic.
- mplot() on a data frame supports ggformula
now.
- ggformula has been
added.
- lattice and ggformula
has been added.
- mosaic to
mosaicCore. This should not affect users of
mosaic.
- tally() now provide names to dimnames in
cases where they were previously missing. This was needed for the
refactoring of bargraph().
- bargraph() to use tally() for
tabulation. This means the behavior of bargraph() should
match expectations of users of tally() better than it did
before. In particular, proportions now sum to 1 in each panel of a
multi-panel plot.
- tally() so the proportions computed when
format = "proportion" are easier to predict.
- prop(x ~ y) was reporting overall proportions
rather than marginal proportions.
- value(), a generic with several methods for
extracting a “value” from a more complicated object. Useful for
extracting values from output of uniroot(),
nlm(), integrate(),
cubature::adaptIntegrate() without needing to know just how
those values are stored in the object.
- prop(a ~ b) to compute joint rather
than conditional proportions.
- favstats(), mean(),
sd(), etc.) now require that the first argument be a
formula. This was always the preferred method, but some functions
allowed bare variable names to be used instead. As a specific example,
the following code now generates an error (unless there is another
object named age in your environment).

    favstats(age, data = HELPrct)
    ## Error in typeof(x) : object 'age' not found

Replace this with

    favstats( ~ age, data = HELPrct)
    ##  min Q1 median Q3 max     mean       sd   n missing
    ##   19 30     35 40  60 35.65342 7.710266 453       0

- ggplot2.
- mplot.data.frame() allow it to work
with an expression that evaluates to a data frame. ASH plots are now a
choice for 1-variable plots.
- deltaMethod() has been moved to a separate package
(called deltaMethod) to reduce package dependencies.
- cull_for_do.lm() now returns a data frame instead of a
vector. This makes it easier for do() to bind things
together by column name.
- makeMap() updated to work with new version of
ggplot2.
- cdata(), ddata(),
pdata(), qdata() and rdata() have
been reordered so that the formula comes first.
- rflip() has
been improved.
- dfapply(), also default value for
select changed to TRUE.
- inspect(), which is primarily intended to
give an overview of the variables in a data frame, but handles some
additional objects as well.
- data argument is not an environment or data frame.
- mm() has been deprecated and replaced with
gwm() which does groupwise models where the response may be
either categorical or quantitative.
- plotModel(). This is
likely still not the final version, but we are getting closer.
- do().
- dotPlot() are now the same size in all panels
of multi-panel plots.
- cdist() has been rewritten.
- mplot() on a data frame now (a) prompts the user for
the type of plot to create and (b) has an added option to make line
plots for time series and the like.
- resample() can now do residual resampling from a linear
model.
- do() to create common bootstrap confidence intervals. In
particular, confint() can now calculate three kinds of
intervals in many common situations.
- fetchData(), fetchGoogle(), and
fetchGapminder() have been moved to a separate package,
called fetch().
- plotModel() can be used to show data and model fits for
a variety of models created with lm() or
glm().
- mosaicData a dependency of mosaic. This
avoids the problem of users forgetting to separately load the
mosaicData package.
- fetchGoogle() (and perhaps
read.file()) from future versions of the package. More and
more packages are providing utilities for bringing data into R and it
doesn’t make sense for us to duplicate those efforts in this package.
For Google Sheets, you might take a look at the
googlesheets package which is available via GitHub now and
will be on CRAN soon.
- binom.test(),
prop.test(), and t.test(), which have also
undergone some internal restructuring. The objects returned now do a
better job of reporting about the test conducted. In particular,
binom.test() and prop.test() will report the
value of success used. (#450, #455)
- binom.test() can now compute several different kinds of
confidence intervals including the Wald, Plus-4 and Agresti-Coull
intervals. (#449)
- derivedFactor() now handles NAs without throwing a
warning. (#451)
- pdist(), pdist() and related
functions now do a better (i.e., useful) job with discrete distributions
(#417)
- t.test()
and all the “aggregating” functions like mean() and
favstats(). In particular, it is now possible to reference
variables both in the data argument and in the calling
environment. (#435)
- CIAdata() now provides a message indicating the source
URL for the data retrieved (#444)
- CIAdata() that seem to be related to a
change in file format at the CIA World Factbook website. The
“inflation” data set is still broken (on the CIA website). (#441)
- read.file() now uses functions from readr
in some cases. A message is produced indicating which reader is being
used. There are also some API changes. In particular, character data
will be returned as character rather than factor. See
factorize() for an easy way to convert things with few
unique values into factors. (#442)
- mutate() is used in place of transform()
in the examples. (#452)
- tally() now produces counts by default for all formula
shapes. Proportions or percentages must be requested explicitly. This is
to avoid common errors, especially when feeding the results into
chisq.test().
- msummary(). Usually this is identical
to summary(), but for a few kinds of objects it provides
modified output that is less verbose.
- do() * lm() will now keep track of the F
statistic, too.
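
As a rough illustration of the item above, the sketch below refits a linear model on a few bootstrap resamples and looks at what do() records for each fit. It assumes the mosaic package (with the KidsFeet data from mosaicData) is installed; the seed, the number of replications, and the object name boot_fits are arbitrary choices, and the exact set of columns returned may differ between versions.

```r
# Sketch only: assumes mosaic and mosaicData are installed.
library(mosaic)

set.seed(123)  # arbitrary seed so the resamples are reproducible

# Refit the same linear model on 5 bootstrap resamples of KidsFeet.
# boot_fits is a hypothetical name chosen for this example.
boot_fits <- do(5) * lm(width ~ length, data = resample(KidsFeet))

# Each row summarises one refit; the F statistic is expected in column "F".
names(boot_fits)
boot_fits$F
```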
- confint() applied to an object produced using
do() now does more appropriate things.
- binom.test() and prop.test() now set
success = 1 by default on 0-1 data to treat 0 like failure
and 1 like success. Similarly, prop() and
count() set level = 1 by default.
- CIsim() can now produce plots and does so by default
when samples <= 200.
- add=TRUE improved for
plotDist().
- swap() which is useful for creating randomization
distributions for paired designs. The current implementation is a bit
slow.
- MAD(),
SAD(), and quantile().
- docFile() introduced to simplify accessing files
included with package documentation. read.file() enhanced
to take a package as an argument and look among package documentation
files.
- factorize() introduced as a way to convert vectors with
few unique values into factors. Can be applied to an entire data
frame.
- NHANES contains the
NHANES data set and mosaicData contains the
other data sets.
- MAD() and SAD() were added to compute mean
and sum of all pairs of absolute differences.
- rspin() has been added to simulate spinning a
spinner.
- mosaic package to simplify R for
beginners.
- mosaic package.
- plotFun() has been improved so that it does a better
job of selecting points where the function is evaluated and no longer
warns about NaNs encountered while exploring the domain of
the function.
- oddsRatio() has been redesigned and
relrisk() has been added. Use their summary()
methods or verbose=TRUE to see more information (including
confidence intervals).
- Birthdays data set.
- mplot() and several instances have been added
to make a number of plots easy to generate. There are methods for
objects of classes "data.frame", "lm",
"summary.lm", "glm",
"summary.glm", "TukeyHSD", and
"hclust". For several of these there are also
fortify methods that return the data frame created to
facilitate plotting.
- read.file() now handles (some?) https URLs and accepts
an optional argument filetype that can be used to declare
the type of data file when it is not identified by extension.
- useNA in the tally()
function has changed to "ifany".
- mosaic now depends on dplyr both to use
some of its functionality and to avoid naming collisions with functions
like tally() and do(), allowing
mosaic and dplyr to coexist more happily.
- dotPlot(). In
particular, the size of the dots is determined differently and works
better more of the time. Dots were also shifted down by .5 units so that
they
- do() that caused it to scope incorrectly
in some edge cases when a variable had the same name as a function.
- ntiles() has been reimplemented and now has more
formatting options.
- derivedFactor() for creating factors
from logical “cases”.
- HELP data set has been removed from the
package.
- HELPrct instead.
- plotDist() now accepts add=TRUE and
under=TRUE, making it easy to add plots of distributions
over (or under) plots of data (e.g., histograms, densityplots, etc.) or
other distributions.
- add=TRUE have
been reimplemented using layer() from
latticeExtra. See documentation of these functions for
details.
- ladd() has been completely reimplemented using
layer() from latticeExtra. See documentation
of ladd() for details, including some behavior
changes.
- mean(), sd(),
var(), et al) now use
getOption("na.rm") to determine the default value of
na.rm. Use options(na.rm=TRUE) to change the
default behavior to remove NAs and options(na.rm=NULL) to
restore defaults.
- do() has been largely rewritten with an eye toward
improved efficiency. In particular, do() will take
advantage of multiple cores if the parallel package is
available. At this point, sluggishness in applications of
do() is most likely due to the sluggishness of what is
being done, not to do() itself.
- deltaMethod() from the
car package to make it easier to propagate uncertainty in
some situations that commonly arise in the physical sciences and
engineering.
- cdist() to compute critical values for the
central portion of a distribution.
- qdata(). For interactive
use, this should not cause any problems, but old programmatic uses of
qdata() should be checked as the object returned is now
different.
- sum(),
mean(), sd(), etc.) to produce
counter-intuitive results (but with a warning). The results are now what
one would expect (and the warning is removed).
- rsquared() for extracting r-squared from models
and model-like objects (r.squared() has been
deprecated).
- do() now handles ANOVA-like objects better.
- maggregate() is now built on some improved behind the
scenes functions. Among other features, the groups argument
is now incorporated as an alternative method of specifying the groups to
aggregate over and the method argument can be set to
"ddply" to use ddply() from the
plyr package for aggregation. This results in a different
output format that may be desired in some applications.
- The cdata(), pdata() and qdata()
functions have been largely rewritten. In addition,
cdata_f(), pdata_f() and
qdata_f() are provided which produce similar results but
have a formula in the first argument slot.
- doc/ and so are available from within the package as well
as via links to external files.
- fetchGapminder() for fetching data sets
originally from Gapminder.
- cdata() for finding end points of a central
portion of a variable.
- prop() to avoid internal
: which makes downstream processing messier.
- manipulate()
(RStudio)
- plotFun() can be used without
manipulate(). This makes it possible to put surface plots
into RMarkdown or Rnw files or to generate them outside of RStudio.
- do() * rflip() now records proportion heads as well as
counts of heads and tails.
- mosaicLatticeOptions() and
restoreLatticeOptions() to switch back and forth between
lattice defaults and mosaic defaults.
- dotPlot() uses a different algorithm to determine dot
sizes. (Still not perfect, but cex can be used to further
scale the dots.)
- histogram() so that nint
matches the number of bins used more accurately.
- i2: max
number of drinks is at least as large as i1: the average
number of drinks.
- D() and
antiD().
- mPlot() provides an interactive environment
for creating lattice and ggplot2 plots.
- sp2df() for converting SpatialPolygonDataFrames to regular
data frames (which is useful for plotting with ggplot2, for
example). Also the Countries data frame facilitates mapping
country names among different sources of map data.
- do() are now marked as such so
that confint() can behave differently for such data frames
and for “regular” data frames.
- t.test() can now do a 1-sample t-test described using a
formula.
- mean(), var(),
etc. using a formula interface) have been completely reimplemented and
additional aggregating functions are provided.
- ntiles() function has been added to facilitate
creating factors based on quantile ranges.
- RailTrail dataset.
- xhistogram() is now deprecated. Use
histogram() instead.
- mean(), max(),
median(), var(), etc.) now use
getOption('na.rm') to determine default behavior.
- var() allow it to work in a wider
range of situations.
- TukeyHSD() so that explicit use of
aov() is no longer required.
- panel.lmbands() for plotting confidence and
prediction bands in linear regression.
- Animals from MASS has been removed by renaming
the data set GestationLongevity.
- freqpolygon() for making frequency polygons.
- r.squared() for extracting r-squared from models
and model-like objects.
- do() so that
hyphens (‘-’) are turned into dots (‘.’)
- fetchData().

We are still in beta, but we hope things are beginning to stabilize as we settle on syntax and coding idioms for the package. Here are some of the key updates since 0.4:

- lm() and its cousins.
- makeFun() now has methods for glm and nls objects.
- D() improved to use symbolic differentiation in more
cases and allow pass through to stats::D() when that makes
sense. This allows functions like deltaMethod() from the car package to
work properly even when the mosaic package is loaded.
- antiD() has been modified somewhat. This
may go through another revision if/when we add in symbolic
differentiation, but we think we are now close to the end state.
- fitSpline() and fitModel() have been added
as wrappers around linear models using ns(), bs(), and nls(). Each of
these returns the model fit as a function.
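
The "model fit as a function" idea in the last few items can be sketched as follows. This is a minimal example, assuming a version of mosaic that still exports makeFun() methods for model objects and fitSpline(). The data sets (KidsFeet from mosaicData, women from base R), the df = 5 setting, and the object names are illustrative choices, not taken from the notes above.

```r
# Sketch only: assumes mosaic (and mosaicData) are installed.
library(mosaic)

# makeFun() on a fitted model returns an ordinary R function of the predictor.
model <- lm(width ~ length, data = KidsFeet)
width_fit <- makeFun(model)
width_fit(length = 25)   # predicted width for a foot of length 25

# fitSpline() fits a spline and returns the fit directly as a function.
# df = 5 is an arbitrary choice for illustration.
weight_fit <- fitSpline(weight ~ height, data = women, df = 5)
weight_fit(65)           # fitted weight at height 65
```

Because the results are plain functions, they can be passed on to tools such as plotFun() or D() without any further extraction from the model object.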