calcDependence: Calculate dependence with a target variable

View source: R/calcDependence.R

Description

A front-end convenience function that predicts direct dependence with a target variable using one of several Graphical Modelling algorithms (see Details).

Usage

calcDependence(dd, method = "ncpc", ...)

Arguments
dd
An object of class DDDataSet.

method
Algorithm to use. Valid values are:

ncpc: Neighbourhood Consistent PC algorithm
ncpc*: Neighbourhood Consistent PC algorithm, star version
hc: Hill-climbing with custom penalty functions
hc-bic: Hill-climbing with BIC penalization (package bnlearn)
hc-bde: Hill-climbing with BDe penalization (package bnlearn)
iamb: IAMB algorithm (package bnlearn)
fast.iamb: Fast-IAMB algorithm (package bnlearn)
inter.iamb: Inter-IAMB algorithm (package bnlearn)
pc: PC algorithm (package pcalg)
mmpc: MMPC algorithm (package bnlearn)
mmhc: MMHC with custom penalty functions
mmhc-bic: MMHC with BIC penalization (package bnlearn)
mmhc-bde: MMHC with BDe penalization (package bnlearn)

...
Extra parameters passed to the backend functions ncpc(), plotBNLearn()
and plotPCAlgo(), depending on the chosen algorithm (parameter method).

Extra parameters for ncpc and ncpc*:

alpha: the alpha (P-value) cutoff for conditional independence tests (default: 0.05)
p.value.adjust.method: the multiple testing correction adjustment method (default: "none")
test.type: the type of conditional independence test (default: "mc-x2-c"). See the
documentation for ciTest for the available conditional independence tests
max.set.size: the maximal number of variables to condition on; if NULL, it is
estimated from the number of positives in the class labels. Needs to be specified for
continuous data. (default: NULL)
mc.replicates: the number of Monte Carlo replicates for the conditional independence
test, if applicable (default: 5000)
report.file: name of the file where a detailed report is to be printed;
reporting is suppressed if NULL (default: NULL)
verbose: whether to print out information about the algorithm's progress (default: TRUE)
min.table.size: the minimal number of samples in a contingency table per conditioning set
(applicable only to discrete data) (default: 10)
Extra parameters for hc and mmhc:

score: the score function to use; accepts all scores from the bnlearn package. For discrete
data: "loglik", "aic", "bic", "bde", "k2". For continuous data: "loglik-g", "aic-g", "bic-g",
"bge". For more details see the help page for package bnlearn.
make.plot: whether to make a plot or just return the network (default: FALSE)
blacklist: a data frame with two columns (optionally labeled "from" and "to"),
containing a set of arcs not to be included in the graph (default: NULL)
restart: the number of random restarts for score-based algorithms (default: 0)
scale: the colour scaling (default: 1.5)
class.label: the label to use for the target variable (default: "target")
use.colors: whether to colour-code the enrichment/depletion in a plot (default: TRUE)
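As a concrete sketch of the blacklist format described above (the variable names below are hypothetical placeholders, not taken from the package data):

```r
# Sketch: forbid arcs leaving the target node and pointing at two
# illustrative variables, "twi" and "bin". As described above, a blacklist
# is a plain two-column data frame of forbidden arcs.
bl <- data.frame(from = c("target", "target"),
                 to   = c("twi", "bin"))

# hypothetical call, assuming `dd` is a DDDataSet:
# calcDependence(dd, "hc", blacklist = bl)
```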
Extra parameters for hc-bic, hc-bde, mmhc-bic and mmhc-bde:

make.plot: whether to make a plot or just return the network (default: FALSE)
blacklist: a data frame with two columns (optionally labeled "from" and "to"),
containing a set of arcs not to be included in the graph (default: NULL)
restart: the number of random restarts for score-based algorithms (default: 0)
scale: the colour scaling (default: 1.5)
class.label: the label to use for the target variable (default: "target")
use.colors: whether to colour-code the enrichment/depletion in a plot (default: TRUE)
Extra parameters for iamb, fast.iamb, inter.iamb and mmpc:

make.plot: whether to make a plot or just return the network (default: FALSE)
alpha: the alpha value for conditional independence tests (default: 0.05)
test: the type of conditional independence test (default: "mc-mi"). For the
available conditional independence tests consult the bnlearn package help
page (?bnlearn).
B: the number of Monte Carlo runs for conditional independence tests,
if applicable (default: 5000)
blacklist: a data frame with two columns (optionally labeled "from" and "to"),
containing a set of arcs not to be included in the graph (default: NULL)
scale: the colour scaling (default: 1.5)
class.label: the label to use for the target variable (default: "target")
use.colors: whether to colour-code the enrichment/depletion in a plot (default: TRUE)
Extra parameters for pc:

alpha: the alpha value cutoff for the conditional independence tests (default: 0.05)
verbose: whether to show progress (default: FALSE)
directed: if TRUE applies the PC algorithm, if FALSE applies only the PC skeleton phase (default: TRUE)
make.plot: whether to make a plot of the final inferred network (default: FALSE)
scale: the scaling parameter for colour-coding (default: 1.5)
indepTest: the independence test wrapper function (default: mcX2Test).
The following functions are available: mcX2Test (a wrapper around the "mc-x2-c" Monte Carlo
X2 test with B=5000), mcX2TestB50k (a wrapper around the "mc-x2-c" Monte Carlo X2 test with
B=50000), and mcMITest (a wrapper around the "mc-mi" test from bnlearn with B=5000).
The package pcalg additionally provides the following tests:
binCItest for binary data (performs a G^2 test), gaussCItest for continuous data
(performs Fisher's Z transformation), and disCItest for discrete data (performs a G^2 test).
class.label: the label to show for the target variable (default: "target")
use.colors: whether to colour-code the results (default: TRUE)
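The indepTest argument accepts any function with pcalg's standard test interface: the function receives two variable indices x and y, a conditioning set S and a sufficient-statistic object suffStat, and returns a p-value. A minimal sketch of a custom wrapper (the wrapper name is illustrative; it simply delegates to pcalg's Fisher's Z test):

```r
# Sketch of a custom conditional-independence test wrapper for method="pc".
# pcalg calls indepTest(x, y, S, suffStat) and expects a single p-value back;
# here we delegate to pcalg's gaussCItest for continuous data.
myGaussTest <- function(x, y, S, suffStat) {
    pcalg::gaussCItest(x, y, S, suffStat)
}

# hypothetical call, assuming `dd` is a continuous DDDataSet:
# calcDependence(dd, "pc", indepTest = myGaussTest)
```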

Details

This function is a front-end convenience function to access predictions of direct
dependence with a target variable by various Graphical Modelling algorithms.

Consider a set of variables X_1, ..., X_m and a target variable T. We say that
X_i is directly dependent with T if there is no other set of variables X_j, X_k, ...
such that it renders X_i conditionally independent of T. In other words,
X_i is the most immediate causal cause/consequence of T in the set of chosen variables.

Note that the above statement is different from that of classical feature selection for
classification. A set of features obtained with feature selection has the property that
a good classifier can be made based on them alone, while the above statement establishes
statistical properties of variables. The set of variables with direct dependence
might not be optimal for classification, since classification performance can be
strongly influenced by false negatives (Friedman et al, 1997).
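The distinction can be seen in a toy base-R simulation (not part of the package): in the chain T -> X2 -> X3, the variable X3 is marginally dependent on the target but becomes independent of it once X2 is conditioned on, so only X1 (a direct cause) and X2 (a direct consequence) are directly dependent with T.

```r
# Toy illustration of direct vs. marginal dependence: X1 -> T -> X2 -> X3.
set.seed(1)
n      <- 5000
x1     <- rbinom(n, 1, 0.5)
target <- rbinom(n, 1, ifelse(x1 == 1, 0.8, 0.2))      # X1 -> T
x2     <- rbinom(n, 1, ifelse(target == 1, 0.8, 0.2))  # T  -> X2
x3     <- rbinom(n, 1, ifelse(x2 == 1, 0.8, 0.2))      # X2 -> X3

# marginal test: X3 and T are clearly dependent (tiny p-value)
p.marginal <- chisq.test(table(x3, target))$p.value

# conditional test (Cochran-Mantel-Haenszel across the strata of X2):
# X3 is conditionally independent of T given X2, so the p-value is large
p.conditional <- mantelhaen.test(table(x3, target, x2))$p.value

p.marginal < p.conditional  # the dependence vanishes after conditioning on X2
```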
Value

A list with elements:

obj: the resulting object, either of class DDGraph for the ncpc and ncpc* algorithms,
of class bn for the bnlearn algorithms, or of class pcAlgo for the PC algorithm.
nbr: the variables with direct dependence (i.e. the target node's neighbourhood in the
causal graph). For both ncpc and ncpc* this includes variables with direct and joint dependence.
mb: the variables in the Markov Blanket of the target variable. Not applicable for the ncpc
algorithm. For the ncpc* algorithm this includes variables with direct, joint and
conditional dependence.
labels: for ncpc and ncpc*, the set of labels that are output of the algorithm.
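A sketch of how the returned list might be inspected, assuming the mesoBin example data shipped with the package (see Examples):

```r
# requires the ddgraph package and its bundled example data
library(ddgraph)
data(mesoBin)

# run ncpc on the visceral muscle (VM) dataset, suppressing progress output
res <- calcDependence(mesoBin$VM, "ncpc", verbose = FALSE)

res$obj     # the fitted DDGraph object
res$nbr     # variables with direct (and joint) dependence on the target
res$labels  # per-variable dependence labels assigned by ncpc
```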
References

Nir Friedman, Dan Geiger, and Moises Goldszmidt, "Bayesian Network Classifiers",
Machine Learning 29 (November 1997): 131-163.
Examples

# load in the data for fly mesoderm
data(mesoBin)

# increase alpha to 0.1, suppress progress output
calcDependence(mesoBin$VM, "ncpc", alpha=0.1, verbose=FALSE)

# run ncpc* with mutual information with shrinkage and minimal numbers of
# samples per conditioning set of 15
calcDependence(mesoBin$VM, "ncpc*", test.type="mi-sh", min.table.size=15)

# run PC algorithm using the G^2 test from pcalg package
calcDependence(mesoBin$VM, "pc", indepTest=pcalg::binCItest)

# run hill-climbing with BIC penalization and plot the resulting Bayesian Network
# NOTE: plotting requires the Rgraphviz package
if (require("Rgraphviz"))
    calcDependence(mesoBin$VM, "hc-bic", make.plot=TRUE)

# continuous data example
data(mesoCont)

# run ncpc with linear correlation test and with maximal conditioning set of 3
res <- calcDependence(mesoCont$VM, "ncpc", max.set.size=3, test.type="cor")

# plot the resulting ddgraph with colours
if (require("Rgraphviz"))
    plot(res$obj, col=TRUE)

ddgraph documentation built on Nov. 17, 2017, 10:50 a.m.