Key Terms
Algorithms
Resampling
Plots
100

THE NUMBER THAT OCCURS MOST FREQUENTLY IN A DISTRIBUTION

MODE

100

WHAT ALGORITHM USES R^2 AS A METRIC?

REGRESSION

100

WHAT IS THE BASIC TRAIN/TEST SPLIT?

80/20

100

WHAT PLOT GROUPS BY RELATIONSHIP?

DENDROGRAM

200

DATA POINTS THAT FALL IN A SYMMETRICAL, BELL SHAPED CURVE

NORMAL DISTRIBUTION

200

WHAT ALGORITHM OFTEN USES LOGLOSS AS A METRIC?

LOGISTIC REGRESSION

200

HOW MANY TIMES IS CROSS-VALIDATION RUN?

K TIMES

200

What are the five values needed to make a box-and-whisker plot?

Minimum, 1st quartile, median, 3rd quartile, maximum
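
As an illustration of the answer above, here is a minimal Python sketch that computes the five values. The function name `five_number_summary` is made up for this example, and the quartiles use the "inclusive" method from Python's `statistics` module; other software may use a different quartile convention and return slightly different values.

```python
import statistics


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers.

    Assumption: quartiles follow the 'inclusive' convention of
    statistics.quantiles; other tools may differ slightly.
    """
    if len(values) < 2:
        raise ValueError("need at least two values")
    ordered = sorted(values)
    # n=4 splits the data into quarters and returns the three cut points.
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return (ordered[0], q1, median, q3, ordered[-1])


summary = five_number_summary([7, 1, 9, 3, 5, 2, 8, 4, 6])
print(summary)
```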

300

HOW SPREAD OUT THE DATA ARE; I.E. HOW DIFFERENT THE VALUES ARE

VARIABILITY

300

WHAT ALGORITHM IS APPLIED IN DEEP LEARNING?

NEURAL NETWORKS

300

WHAT IS K USUALLY?

5 OR 10

300

WHAT ARE THE DENDROGRAM BRANCHES CALLED?

CLADES

400

HOW TO DESCRIBE A RELATIONSHIP WHEN HIGH LEVELS OF ONE VARIABLE ARE RELATED TO LOW LEVELS OF ANOTHER VARIABLE

NEGATIVE RELATIONSHIP

400

WHAT ALGORITHM USES RULES TO CLASSIFY?

DECISION TREES

400

WHAT'S THE TECHNICAL NAME OF THE ERROR RATE?

MISCLASSIFICATION RATE
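
For a concrete sense of this metric, the misclassification rate is the fraction of predictions that do not match the true labels, i.e. 1 minus accuracy. The sketch below is illustrative only; the function name and the sample labels are invented for this example.

```python
def misclassification_rate(actual, predicted):
    """Fraction of predictions that differ from the true labels."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("need at least one observation")
    # Count positions where the predicted label is wrong.
    wrong = sum(a != p for a, p in zip(actual, predicted))
    return wrong / len(actual)


# Hypothetical labels: one of four predictions is wrong.
rate = misclassification_rate(["yes", "no", "yes", "no"],
                              ["yes", "no", "no", "no"])
print(rate)
```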

400

WHAT PACKAGE IS USED IN R FOR DENDROGRAMS?

GGDENDRO

500

RANGE OF VALUES WITHIN WHICH THE PARAMETER IS ESTIMATED TO BE, AT A SPECIFIED PROBABILITY, E.G., 95% CI

CONFIDENCE INTERVAL

500

WHAT ALGORITHM ENSEMBLES RULES?

RANDOM FOREST

500

WHAT ARE THE STEPS OF CROSS-VALIDATION?

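1) DIVIDE THE OBSERVATIONS INTO K GROUPS, OR FOLDS, OF ROUGHLY EQUAL SIZE

2) THE FIRST FOLD IS TREATED AS A VALIDATION SET, AND THE MODEL IS FIT ON THE REMAINING K-1 FOLDS

3) THE PROCEDURE IS REPEATED K TIMES, WITH A DIFFERENT FOLD HELD OUT EACH TIME, AND THE K ERROR ESTIMATES ARE AVERAGED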
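
The cross-validation steps can be sketched in plain Python as follows. This is a minimal illustration, not a reference implementation: all function names are hypothetical, the "model" is just the mean of the training responses, and the folds are taken in order, whereas in practice the observations are usually shuffled first.

```python
def k_fold_indices(n_observations, k):
    """Step 1: split indices 0..n-1 into k folds of near-equal size."""
    if not 2 <= k <= n_observations:
        raise ValueError("k must be between 2 and the number of observations")
    base, extra = divmod(n_observations, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_validate(xs, ys, k, fit, score):
    """Steps 2 and 3: hold out each fold once, fit on the rest, average."""
    scores = []
    for held_out in k_fold_indices(len(xs), k):
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        scores.append(score(model,
                            [xs[i] for i in held_out],
                            [ys[i] for i in held_out]))
    return sum(scores) / k, scores


# Toy model (an assumption for illustration): predict the training mean.
def fit_mean(train_x, train_y):
    return sum(train_y) / len(train_y)


def mean_squared_error(model, test_x, test_y):
    return sum((y - model) ** 2 for y in test_y) / len(test_y)


xs = list(range(10))
ys = [2.0] * 10
average_error, fold_errors = cross_validate(xs, ys, 5, fit_mean, mean_squared_error)
print(average_error, fold_errors)
```

The loop body runs once per fold, which is why cross-validation is run K times.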

500

List at least 2 libraries in R that are used for data visualization

ggplot2, lattice, leaflet, highcharter, RColorBrewer, plotly, sunburstR, rgl, dygraphs
