f.hist {DASplusR}    R Documentation

Histograms - Scatterplot - Boxplot

Description

The function 'f.hist' plots a histogram of the given data values, optionally with a scatterplot and/or a boxplot beneath it.

Usage

f.hist(data, scatter = TRUE, box = TRUE, P.plot=TRUE, 
       P.main = paste("Histogram of", deparse(substitute(data))),
       P.sub = NULL, P.xlab = deparse(substitute(data)),
       P.ylab = default, P.ann = par("ann"), P.axes = TRUE,
       P.frame.plot = P.axes, B.range = 1.5, B.notch = FALSE,
       B.outline = TRUE, B.border = par("fg"), B.col = NULL,
       B.pch = par("pch"), B.cex = 1, B.bg = NA, H.breaks = "Sturges",
       H.freq = TRUE, H.include.lowest = TRUE, H.right = TRUE,
       H.density = NULL, H.angle = 45, H.col = NULL, H.border = NULL,
       H.labels = FALSE, S.pch = ".", S.col = par("col"), S.bg = NA,
       S.cex = 1)

Arguments

data a numeric vector of length greater than 1, for which the plots are desired.
scatter logical; if 'TRUE', a scatterplot is added beneath the histogram.
box logical; if 'TRUE', a boxplot is added beneath the histogram, or beneath the scatterplot if 'scatter' is 'TRUE' as well.
P.plot logical; if 'FALSE', no plot is drawn and only the statistics are returned.
P.main a main title for the plot.
P.sub a subtitle for the plot.
P.xlab a label for the x axis.
P.ylab a label for the y axis.
P.ann a logical value indicating whether the default annotation (title and x and y axis labels) should appear on the plot.
P.axes a logical value indicating whether axes should be drawn on the plot.
P.frame.plot a logical value indicating whether a box should be drawn around the plot.
B.range this determines how far the plot whiskers extend out from the box. If range is positive, the whiskers extend to the most extreme data point which is no more than range times the interquartile range from the box. A value of zero causes the whiskers to extend to the data extremes.
B.notch if notch is TRUE, a notch is drawn in each side of the boxes. If the notches of two plots do not overlap then the medians are significantly different at the 5 percent level.
B.outline if outline is not true, the outliers are not drawn.
B.border an optional vector of colors for the outlines of the boxplots. The values in border are recycled if the length of border is less than the number of plots.
B.col if col is non-null it is assumed to contain colors to be used to colour the bodies of the box plots.
B.pch either a 'character' or an integer code for a graphic symbol, which determines the appearance of the values that lie beyond the whiskers. See also 'points'.
B.cex expansion of the character determined by 'B.pch'; a numerical vector. See also 'points'.
B.bg background ("fill") color for open plot symbols determined by 'B.pch'. See also 'points'.
H.breaks one of:
* a vector giving the breakpoints between histogram cells,
* a single number giving the number of cells for the histogram,
* a character string naming an algorithm to compute the number of cells (see Details),
* a function to compute the number of cells.
H.freq logical; if TRUE, the histogram graphic is a representation of frequencies, the counts component of the result; if FALSE, relative frequencies ('probabilities'), component density, are plotted.
H.include.lowest logical; if TRUE, an x[i] equal to the breaks value will be included in the first (or last, for right = FALSE) bar. This will be ignored (with a warning) unless breaks is a vector.
H.right logical; if TRUE, the histogram cells are right-closed (left open) intervals.
H.density the density of shading lines, in lines per inch. The default value of NULL means that no shading lines are drawn. Non-positive values of density also inhibit the drawing of shading lines.
H.angle the slope of shading lines, given as an angle in degrees (counter-clockwise).
H.col a colour to be used to fill the bars. The default of NULL yields unfilled bars.
H.border the color of the border around the bars. The default is to use the standard foreground color.
H.labels logical or character. Additionally draw labels on top of bars, if not FALSE; see 'plot.histogram'.
S.pch either a 'character' or an integer code for a graphic symbol, which determines the appearance of the dots in the scatterplot. See also 'points'.
S.col color code or name for the characters determined by 'S.pch', see 'par'.
S.bg background ("fill") color for open plot symbols determined by 'S.pch'. See also 'points'.
S.cex expansion of the character determined by 'S.pch'; a numerical vector. See also 'points'.
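
The effect of 'B.range' described above can be sketched with base R's 'boxplot', under the assumption (suggested by this page but not stated outright) that 'f.hist' hands 'B.range' on as the 'range' argument of 'boxplot'. The data vector is made up for illustration.

```r
# Illustrative data: ten regular values and one extreme value
x <- c(1:10, 50)

# range = 1.5: whiskers stop at the most extreme point within
# 1.5 times the interquartile range of the box
b_default <- boxplot(x, range = 1.5, plot = FALSE)

# range = 0: whiskers extend to the data extremes
b_full <- boxplot(x, range = 0, plot = FALSE)

b_default$out          # points beyond the whiskers
b_default$stats[5, 1]  # end of the upper whisker
b_full$out             # empty: nothing lies beyond the whiskers
b_full$stats[5, 1]     # upper whisker now reaches the maximum
```

With the default range the value 50 is reported as lying beyond the whiskers; with a range of zero it becomes the end of the upper whisker instead.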

Details

The definition of "histogram" differs by source (with country-specific biases). R's default with equi-spaced breaks (also the default) is to plot the counts in the cells defined by 'breaks'. Thus the height of a rectangle is proportional to the number of points falling into the cell, as is the area, provided the breaks are equally spaced.

The default with non-equi-spaced breaks is to give a plot of area one, in which the 'area' of the rectangles is the fraction of the data points falling in the cells.
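
The two cases above can be checked with base R's 'hist', which is presumably what 'f.hist' calls internally (an assumption; this page only says that the values are those returned by 'hist'). The data and breakpoints below are invented for illustration.

```r
# Illustrative data on the interval (0, 10]
x <- c(0.5, 1.2, 1.8, 2.5, 3.1, 3.3, 4.7, 7.9, 9.5, 9.9)

# Equally spaced breaks: counts and densities are proportional
h_equal <- hist(x, breaks = 0:10, plot = FALSE)

# Unequally spaced breaks: only the density gives a plot of area one
h_unequal <- hist(x, breaks = c(0, 1, 2, 4, 10), plot = FALSE)

h_equal$equidist    # TRUE
h_unequal$equidist  # FALSE

# Area of the rectangles: density times cell width, summed over cells
area_equal   <- sum(h_equal$density * diff(h_equal$breaks))
area_unequal <- sum(h_unequal$density * diff(h_unequal$breaks))
```

In both cases the density-based area sums to one, but only with equal cell widths are the raw counts also proportional to the areas.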

If 'right = TRUE' (default), the histogram cells are intervals of the form '(a, b]', i.e., they include their right-hand endpoint, but not their left one, with the exception of the first cell when 'include.lowest' is 'TRUE'.

For 'right = FALSE', the intervals are of the form '[a, b)', and 'include.lowest' really has the meaning of "'include highest'". A numerical tolerance of 1e-7 times the range of the breaks is applied when counting entries on the edges of bins.
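
A small sketch of the two interval conventions, using base R's 'hist' directly with data that sit exactly on the breakpoints. It assumes that 'H.right' and 'H.include.lowest' are passed on to the 'right' and 'include.lowest' arguments of 'hist'.

```r
# Every data value coincides with a breakpoint
x <- c(1, 2, 3)
b <- c(1, 2, 3)

# Cells [1, 2] and (2, 3]: the value 2 falls into the first cell
counts_right <- hist(x, breaks = b, right = TRUE,
                     include.lowest = TRUE, plot = FALSE)$counts

# Cells [1, 2) and [2, 3]: the value 2 falls into the second cell
counts_left <- hist(x, breaks = b, right = FALSE,
                    include.lowest = TRUE, plot = FALSE)$counts

counts_right  # 2 1
counts_left   # 1 2
```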

The default for 'breaks' is '"Sturges"': see 'nclass.Sturges'. Other names for which algorithms are supplied are '"Scott"' and '"FD"' / '"Freedman-Diaconis"' (with corresponding functions 'nclass.scott' and 'nclass.FD'). Case is ignored and partial matching is used. Alternatively, a function can be supplied which will compute the intended number of breaks as a function of 'x'.
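
The named algorithms can be called directly to see how many cells each one suggests. The sketch below uses base R's 'hist' and the 'nclass.*' functions from 'grDevices'; passing the same values through 'H.breaks' of 'f.hist' is assumed to behave the same way. Note that 'hist' treats the number only as a suggestion and rounds the breakpoints to "pretty" values, so the final number of cells can differ.

```r
set.seed(42)
x <- rnorm(1000)

# Suggested number of cells under each rule
n_sturges <- nclass.Sturges(x)  # ceiling(log2(length(x)) + 1)
n_scott   <- nclass.scott(x)
n_fd      <- nclass.FD(x)

# Naming the algorithm and supplying the function are equivalent
h_name <- hist(x, breaks = "Scott", plot = FALSE)
h_fun  <- hist(x, breaks = nclass.scott, plot = FALSE)
same_breaks <- identical(h_name$breaks, h_fun$breaks)
```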

Value

An invisible list with two components: 'H', an object of class 'histogram' as returned by 'hist', and 'B', a list as returned by 'boxplot'.

H$breaks the n+1 cell boundaries (= 'breaks' if that was a vector).
H$counts n integers; for each cell, the number of 'x[]' inside.
H$density values f(x[i]), as estimated density values. If 'all(diff(breaks) == 1)', they are the relative frequencies 'counts/n' and in general satisfy sum[i; f(x[i]) (b[i+1]-b[i])] = 1, where b[i] = 'breaks[i]'.
H$intensities same as 'density'. Deprecated, but retained for compatibility.
H$mids the n cell midpoints.
H$xname a character string with the actual 'x' argument name.
B$stats a matrix, each column contains the extreme of the lower whisker, the lower hinge, the median, the upper hinge and the extreme of the upper whisker for one group/plot.
B$n a vector with the number of observations in each group.
B$conf a matrix where each column contains the lower and upper extremes of the notch.
B$out the values of any data points which lie beyond the extremes of the whiskers.
B$group a vector of the same length as 'out' whose elements indicate which group the outlier belongs to.
B$names a vector of names for the groups.
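
The shape of the returned list can be imitated without 'f.hist' by combining the results of 'hist' and 'boxplot' by hand. This is only a stand-in built on the description above, not the function's actual code; the name 'result' and the data are hypothetical.

```r
# Illustrative data with one value far from the rest
x <- c(1, 2, 2, 3, 3, 3, 4, 4, 5, 20)

# Stand-in for the value of f.hist(x, P.plot = FALSE)
result <- list(
  H = hist(x, plot = FALSE),     # object of class "histogram"
  B = boxplot(x, plot = FALSE)   # list of boxplot statistics
)

class(result$H)
names(result$B)
length(result$H$breaks) - length(result$H$counts)  # n + 1 boundaries for n cells
result$B$stats   # whisker ends, hinges and median, one column per group
result$B$out     # values beyond the whiskers
```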

Note

The input vector 'data' can contain 'NA's and 'NaN's, which are ignored and left out during the computation.
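
The same behaviour can be observed in the underlying base functions, which is presumably where 'f.hist' inherits it from (an assumption). In this sketch with made-up data, both 'hist' and 'boxplot' count only the four usable values.

```r
# Six entries, two of which are missing or not a number
x <- c(1, 2, NA, 3, NaN, 4)

h <- hist(x, plot = FALSE)
b <- boxplot(x, plot = FALSE)

n_hist <- sum(h$counts)  # observations that entered the histogram
n_box  <- b$n            # observations that entered the boxplot
```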

See Also

'hist', 'boxplot'

Examples

x <- rnorm(1000)    # named 'x' to avoid masking the base function c()
par(mfrow = c(2, 2))
f.hist(x)    # plots all three plots
f.hist(x, box = FALSE, scatter = FALSE)    # plots just a histogram
f.hist(x, scatter = FALSE)    # plots a histogram and a boxplot
f.hist(x, box = FALSE)    # plots a histogram and a scatterplot

stat <- f.hist(rexp(500), P.plot = FALSE)
stat$H$breaks   # vector with the cell boundaries of the histogram
stat$B$out   # vector with the data points which lie beyond the whiskers

[Package DASplusR version 0.0-2 Index]