makeSubset {DASplusR}    R Documentation

DASData - subsets

Description

The function 'makeSubset()' adds subsets to a data object of class 'DASData' or 'data.frame'. A 'data.frame' passed to the function additionally receives the class 'DASData'.

Usage

makeSubset(dat,subname,ind)

Arguments

dat a data object of class 'DASData' or 'data.frame' to which a subset (and, where applicable, other attributes) is to be added
datname the name of the dat object in the .GlobalEnv environment
subname character; a string specifying the name for the subset to create
ind one of:
* a 0/1 vector of length 'nrow(dat)', where ones mark the selected rows,
* a TRUE/FALSE logical vector (not a factor) of length 'nrow(dat)', where TRUE marks the selected rows,
* a vector of row names as character strings,
* a vector of integers, each identifying a row to be part of the subset,
* an expression as a character string giving a condition, e.g. on columns of 'dat', that restricts the rows selected.

In any case the index information will be stored as integers identifying rows.
overwrite logical; if TRUE, the function may overwrite an existing subset called 'subname' (default FALSE)
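
The accepted forms of 'ind' can be illustrated with a minimal base R sketch that reduces each form to integer row numbers. The helper 'to_row_indices' is hypothetical and is not part of DASplusR; the order of the checks (in particular, treating a numeric vector of length 'nrow(dat)' containing only 0 and 1 as a mask) is an assumption, not the package's documented behaviour.

```r
# Hypothetical helper (not part of DASplusR): reduce every accepted form
# of 'ind' to integer row numbers.
to_row_indices <- function(dat, ind) {
  n <- nrow(dat)
  if (is.factor(ind)) stop("'ind' must not be a factor")
  if (anyNA(ind)) stop("'ind' must not contain missing values")
  if (is.logical(ind)) {
    stopifnot(length(ind) == n)
    return(which(ind))                      # TRUE marks selected rows
  }
  if (is.numeric(ind)) {
    # Assumption: a full-length vector of only 0/1 is a mask, not row numbers
    if (length(ind) == n && all(ind %in% c(0, 1))) {
      return(which(ind == 1))
    }
    return(as.integer(ind))                 # plain row numbers
  }
  if (is.character(ind)) {
    if (all(ind %in% rownames(dat))) {
      return(match(ind, rownames(dat)))     # row names
    }
    # Otherwise: a single condition, evaluated with the columns of 'dat' visible
    stopifnot(length(ind) == 1)
    keep <- eval(parse(text = ind), envir = dat, enclos = parent.frame())
    stopifnot(is.logical(keep), length(keep) == n)
    return(which(keep))
  }
  stop("unsupported type for 'ind'")
}

# Made-up example data with 5 rows and default row names
test <- data.frame(Na = c(3.2, 4.5, 5.1, 2.8, 4.9))
to_row_indices(test, c(0, 1, 1, 0, 0))   # rows 2 and 3
to_row_indices(test, "Na>4")             # rows 2, 3 and 5
```

Evaluating the condition with 'envir = dat' makes the column names visible without attaching the data object, which is a different design from the attach-based behaviour described under Details.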

Details

The function adds 'attributes' to the data object, if they do not exist already. The information added can be accessed via 'attributes(dat)'.

If 'ind' is a character expression that refers to a column name directly, as in makeSubset(dat,subname,ind="col1<15"), the data object is automatically attached (see '?attach' for details).

The standard usage is to qualify the columns explicitly, as in makeSubset(dat,subname,ind="dat$col1<15") or makeSubset(dat,subname,ind="dat[,1]<15").

The flag 'overwrite' is FALSE by default and can be set to TRUE if an existing subset named 'subname' is to be replaced by the one specified through 'ind'. If there is more than one subset called 'subname', the first occurrence is replaced when 'overwrite' is TRUE. If 'overwrite' is FALSE and a subset named 'subname' already exists, an error message appears and no changes are applied.
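
The overwrite rule described above can be sketched in base R as follows. The function 'add_subset' is hypothetical and only mimics the documented rule on a 'subsets' attribute; unlike 'makeSubset()', it returns the modified object instead of writing it to '.GlobalEnv', and it signals the refusal with stop(), which is an assumption about how the error is reported.

```r
# Hypothetical sketch (not DASplusR code) of the 'overwrite' rule.
add_subset <- function(dat, subname, indices, overwrite = FALSE) {
  subsets <- attr(dat, "subsets")
  if (is.null(subsets)) subsets <- list()          # empty subset attribute
  pos <- match(subname, names(subsets))            # first occurrence, or NA
  entry <- list(indices = as.integer(indices))
  if (is.na(pos)) {
    # No subset of that name yet: append a new named entry
    subsets <- c(subsets, setNames(list(entry), subname))
  } else if (overwrite) {
    subsets[[pos]] <- entry                        # replace the first match only
  } else {
    stop("subset '", subname, "' already exists; set overwrite = TRUE to replace it")
  }
  attr(dat, "subsets") <- subsets
  dat
}

d <- data.frame(x = 1:5)
d <- add_subset(d, "Rows", 1:3)
d <- add_subset(d, "Rows", c(1, 3, 5), overwrite = TRUE)
attr(d, "subsets")$Rows$indices                    # now 1 3 5
```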

As mentioned above, the subset index information is stored as integers identifying row numbers. The common way to obtain a subset as a data object (in a script) is therefore dat[attributes(dat)$subsets$subname$indices,].
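
To make the access pattern concrete without the package installed, the following sketch builds the attribute layout by hand and then extracts the subset. The data values and the subset name 'HighNatrium' are made up for illustration; only the path 'subsets$<name>$indices' is taken from the description above.

```r
# Made-up data; the 'subsets' attribute is set by hand here to mimic
# the layout described above (this does not call makeSubset()).
test <- data.frame(Na = c(3.2, 4.5, 5.1, 2.8, 4.9))
attr(test, "subsets") <- list(HighNatrium = list(indices = c(2L, 3L, 5L)))

# Access with the subset name written literally
idx <- attributes(test)$subsets$HighNatrium$indices
high <- test[idx, , drop = FALSE]

# Access when the subset name is held in a variable: use [[ ]] instead of $
subname <- "HighNatrium"
idx2 <- attr(test, "subsets")[[subname]]$indices
```

Note that '$subname' looks for a subset literally called "subname"; when the name is stored in a variable, '[[subname]]' is the form that works.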

The generated data object of class 'DASData' with subsets is written to the working environment ('.GlobalEnv') during the execution of the function. Therefore, to examine the changes, you have to look for an object with the same name as 'dat' in the working environment. See the examples for details.
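
The write-back behaviour can be pictured with a short base R sketch. The function 'write_back' is hypothetical; how DASplusR actually determines the object name and where it places the class 'DASData' in the class vector are assumptions made only for this illustration.

```r
# Hypothetical sketch (not DASplusR code): modify an object and write it
# back to .GlobalEnv under the name it was passed with.
write_back <- function(dat, datname = deparse(substitute(dat))) {
  force(datname)                                   # capture the name before 'dat' changes
  class(dat) <- unique(c("DASData", class(dat)))   # assumption: class is prepended
  assign(datname, dat, envir = .GlobalEnv)         # side effect in the workspace
  invisible(TRUE)                                  # report success, like a status flag
}

test <- data.frame(Na = c(3.2, 4.5, 5.1, 2.8, 4.9))
ok <- write_back(test)
class(get("test", envir = .GlobalEnv))             # "DASData" "data.frame"
```

Because the result arrives as a side effect, the return value only signals success; the changed object has to be looked up by name in the workspace.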

Value

The function returns TRUE or FALSE depending on whether the operation succeeded. The new object is written to the working environment and is visible there.

Note

No missing values are allowed. The 'overwrite' flag is optional and FALSE by default. An empty subset attribute is an empty list (list()).

Author(s)

Stefan Schnabl stefan.schnabl@gmail.com

See Also

'removeSubsets', 'selectSubsets', 'makeDASData', 'attributes', 'attr'

Examples

# example data object of class data.frame
# ('test' is assumed to have 5 rows, default row names and a numeric column 'Na')
attach(test)
makeSubset(test,"HighNatrium","Na>4")
makeSubset(test,"Rows",1:3)
makeSubset(test,"Rows2",c(1,3,5))
makeSubset(test,"SpecialRows",c(TRUE,FALSE,TRUE,FALSE,TRUE))
makeSubset(test,"SpecialRows2",c(0,1,1,0,0))
makeSubset(test,"Rownames",c("1","2"))

# access subsets
test[attributes(test)$subsets$HighNatrium$indices,]

# delete specific subsets
removeSubsets(test,c(1,3,5))

# delete all
removeSubsets(test,"all")

[Package DASplusR version 1.0-1 Index]