## Quiz 06: Policies and Information

### Location, Date, and Time

Conflicts: There will be no conflict quiz as students are able to choose the time and date of their quiz.

### Quiz Content

All quizzes are cumulative. Previous material can reappear on a later quiz.

7.3: Hypothesis Testing

• Testing Frameworks:
• Why do we use hypothesis testing?
• What is a sampling distribution and how is it used in hypothesis testing?
• What are the four stages of a hypothesis test?
• Ideas of Testing:
• How do research questions inform the null hypothesis and alternative hypothesis?
• What kinds of assumptions are required to use a hypothesis test?
• Does having more numbers after the decimal place in a test statistic guarantee more accurate results?
• Unpaired (Two-sample) t-Test:
• Is a number more “trustworthy” if there are lots of digits after the decimal point?
• How is a test statistic used when computing a p-value?
• Do we have to compute a p-value to make a decision?
• What are some misconceptions about p-values?
• One proportion z-test:
• What happens if a probability value needs to be assessed?
• How do the underlying assumptions for probabilities and samples differ?
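
As a study aid for the topics above, here is a minimal sketch in base R of how a test statistic turns into a p-value and a decision. The data are simulated, and the group names, sample sizes, counts, and the 0.05 level are assumptions made for illustration; nothing here is taken from the quiz itself.

```r
# Simulated data: names, sizes, and means are made up for illustration.
set.seed(385)
group_a <- rnorm(30, mean = 5.0, sd = 1)
group_b <- rnorm(30, mean = 5.8, sd = 1)

# Unpaired (two-sample) t-test. t.test() defaults to Welch's version,
# which does not assume equal variances.
result <- t.test(group_a, group_b)

# Recompute the two-sided p-value by hand from the test statistic
# and the degrees of freedom reported by t.test().
t_stat   <- unname(result$statistic)
df       <- unname(result$parameter)
p_manual <- 2 * pt(abs(t_stat), df = df, lower.tail = FALSE)

# A decision can also be made without a p-value by comparing the
# test statistic to a critical value (two-sided, alpha = 0.05).
critical_value     <- qt(0.975, df = df)
reject_by_critical <- abs(t_stat) > critical_value

# One proportion z-test: 58 successes in 100 trials against p0 = 0.5.
# Without continuity correction, prop.test() reports z^2 as "X-squared".
z_stat      <- (58 / 100 - 0.5) / sqrt(0.5 * (1 - 0.5) / 100)
prop_result <- prop.test(x = 58, n = 100, p = 0.5, correct = FALSE)
```

Comparing `p_manual` with `result$p.value`, and `z_stat^2` with `prop_result$statistic`, is a quick way to check that you understand where each number comes from.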

8.1: Exploratory Data Analysis

Provided with: Data Visualization Cheatsheet

• EDA
• What are the different types of EDA?
• Should you use all the types of EDA for every data set or only use one specific type?
• How does EDA look at patterns in the data?
• How should we detect, analyze, and communicate different patterns in the data?
• How is variation similar to covariation? How are they different?
• How can you detect if the data is “overfit”?
• Graphing Systems in R
• What are the three different graphical systems in R?
• When should each be used in the course of an analysis?
• Why are the graphical systems similar to a “goldilocks” scenario?
• Grammar of Graphics
• How does the Grammar of Graphics theory inform the construction of graphs?
• How are layers specified in ggplot2?
• Explain the significance of:
• Aesthetics
• Geometric Objects
• Coordinate Systems
• Facets
• What is the difference between a local and a global aesthetic as it relates to geometries?
• What are the benefits to using a script to save a graph vs. using a GUI?
• Making Graphs
• If we have a discrete variable, then what kind of graph should we select?
• Given a numerical response with a categorical explanatory variable, how should the data be displayed visually?
• How is a histogram different from a bar plot? Why is this the case?
• What is the difference between facet_wrap() and facet_grid()?
• Themes
• Why should a company design a theme for all of their visualizations?
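
To make the ggplot2 questions above concrete, the following is a small sketch that assumes the `ggplot2` package is installed and uses its bundled `mpg` data set. The choice of variables and the object names are arbitrary and only illustrate one way of building layered graphs.

```r
library(ggplot2)

# Global aesthetics: set in ggplot() and inherited by every layer,
# so geom_smooth() fits one line per value of drv.
p_global <- ggplot(mpg, aes(x = displ, y = hwy, color = drv)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE)

# Local aesthetic: color is mapped only inside geom_point(),
# so geom_smooth() fits a single line through all of the points.
p_local <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point(aes(color = drv)) +
  geom_smooth(method = "lm", se = FALSE)

# facet_wrap() wraps the panels of one variable into a ribbon of plots.
p_wrap <- p_local + facet_wrap(~ class)

# facet_grid() lays panels out as rows by columns of two variables.
p_grid <- p_local + facet_grid(drv ~ cyl)

# Saving from a script keeps the graph reproducible (uncomment to write a file).
# ggsave("hwy_by_displ.png", plot = p_global, width = 6, height = 4)
```

Printing `p_global` and `p_local` side by side shows how moving one `aes()` call changes what the smoothing layer sees.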

8.2: Tidy data

Provided with: Data Import Cheatsheet

• Pipe Operator
• What is the pipe operator (%>%) read as?
• Do all functions have to receive data in their first argument to be used with the pipe operator?
• Tidy Data
• What is the Anna Karenina Principle? How is it related to tidy data?
• List the three tenets of tidy data.
• Describe the five common “messy” forms of data.
• How does one of these forms relate to frequency tables (e.g. contingency tables)?
• How are tidy tenets aligned with “long” and “wide” data?
• Why is a tidy data set powerful in a data analysis?
• Tidy Transformation
• How do we move from a “wide” data set to a “long” data set and vice versa?
• How are the separate and unite paradigms related to the tidy tenets?
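
The transformations asked about above can be sketched as follows, assuming the `tidyr` package is installed (it re-exports the `%>%` pipe). The tiny data frames and column names are invented for illustration.

```r
library(tidyr)

# A made-up "wide" table: one column per year.
wide <- data.frame(
  country = c("A", "B"),
  `2019`  = c(10, 20),
  `2020`  = c(12, 25),
  check.names = FALSE
)

# Wide to long: the year columns become values of a single "year" variable.
# The pipe is commonly read as "then": take wide, then pivot it longer.
long <- wide %>%
  pivot_longer(cols = c(`2019`, `2020`), names_to = "year", values_to = "cases")

# Long back to wide.
back_to_wide <- long %>%
  pivot_wider(names_from = year, values_from = cases)

# When data is not the first argument, the "." placeholder says where it goes.
fit <- long %>% lm(cases ~ year, data = .)

# separate() splits one column holding two values; unite() reverses it.
rates <- data.frame(country = c("A", "B"), rate = c("10/100", "20/400"))
split_rates <- rates %>%
  separate(rate, into = c("cases", "population"), sep = "/", convert = TRUE)
joined <- split_rates %>%
  unite("rate", cases, population, sep = "/")
```

Newer tidyr versions also offer `separate_wider_delim()` as a replacement for `separate()`; the older function is used here because it pairs directly with `unite()`.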

• Designing a Graphic
• What role does a graphic play in expressing the data’s narrative?
• How does Simpson’s Paradox yield alternative explanations for data when shown on a graph?
• Why does apophenia require a more methodological approach to interpreting patterns on a graph?
• How can graphs be used to lie with statistics?
• CRAP
• What paradigm can be used to assess the quality of graphics?
• Why do we want to have a framework for assessing a graphic?
• How is CRAP related to Gestalt tenets?
• What kind of techniques can be used to distort the meaning of the data?
• Why are pie charts a poor choice to visualize data?
• Chart Junk
• What kind of effect does chart junk have on the interpretation of a graph?
• How is chart junk related to the “Lie” Factor?
• What does it mean if the Lie factor is greater than 1? What if it is less than 1?
• Modern Graphics
• What are the different types of graphics?
• How are animated graphics created?
• Where should animated graphs be used?
• Why are interactive graphics useful?
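
For the Lie Factor questions above, a short arithmetic sketch may help. It assumes the usual definition (size of the effect shown in the graphic divided by the size of the effect in the data) and uses hypothetical numbers and function names chosen only for illustration.

```r
# Relative change between two values.
effect_size <- function(first, second) {
  abs(second - first) / abs(first)
}

# Lie factor: effect shown in the graphic relative to the effect in the data.
lie_factor <- function(data_first, data_second, graphic_first, graphic_second) {
  effect_size(graphic_first, graphic_second) / effect_size(data_first, data_second)
}

# Hypothetical example: the data grow from 100 to 150 (a 50% increase),
# but the bars drawn for them grow from 2 cm to 6 cm (a 200% increase).
lf <- lie_factor(data_first = 100, data_second = 150,
                 graphic_first = 2, graphic_second = 6)
```

Here `lf` works out to 4, meaning the graphic overstates the change in the data by a factor of four.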

9.2: Shiny

Provided with: Shiny Cheatsheet

• Interactivity and Shiny
• Compare and contrast a Shiny application with a static text report.
• Why are Shiny applications useful for lowering the barrier of entry to using a statistical method in R?
• Components
• How does the Server component differ from the UI component?
• Why is the server often described as the “backend” and the UI the “frontend”?
• What are the benefits of creating a Shiny application?
• Sketching Apps
• What are the different types of prototypes?
• How would a Shiny app fall within the prototype range?
• When should one prototype method be used over another?
• UI
• How are UI objects rendered in a web browser?
• When should paginated layouts be used?
• What are the differences between an Output Area and Control Widgets?
• Why do all Shiny elements require an identifier (id)?
• Server and Reactivity
• How are Output Areas and Output recipes related?
• What is reactivity?
• Why is reactivity similar to what is found in Excel?
• How is reactivity present in a Shiny App?
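
The pieces named above fit together roughly as in this minimal sketch, which assumes the `shiny` package is installed. The ids `"n"` and `"hist"` and the slider settings are made up for illustration.

```r
library(shiny)

# UI ("frontend"): a control widget and an output area, each with an id.
ui <- fluidPage(
  sliderInput(inputId = "n", label = "Sample size",
              min = 10, max = 500, value = 100),
  plotOutput(outputId = "hist")
)

# Server ("backend"): the output recipe for the "hist" output area.
# Because the recipe reads input$n, it re-runs whenever the slider moves.
server <- function(input, output) {
  output$hist <- renderPlot({
    hist(rnorm(input$n), main = paste("n =", input$n))
  })
}

# Bundle both components into an app object; launch it with runApp(app).
app <- shinyApp(ui = ui, server = server)
```

The re-running of `renderPlot()` when `input$n` changes is the reactivity: much like a spreadsheet cell that recalculates when a cell it depends on is edited.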

### Materials Needed

• Preferably, a rested mind and non-broken hands that can type.

### Policies

• All answers must be reasonably simplified.
• Decimal answers must contain two significant digits.
• Grading will be done as follows:

If you have a technical issue while answering questions or need assistance with opening or starting the quiz, please alert the proctor.

Do not leave the CBTF without filing an issue with the proctor if something goes wrong.

### DRES

Have a testing accommodation? Please see how the CBTF handles Letters of Accommodation.

The short version: Please bring a copy of the Letter of Accommodation to the CBTF Proctors prior to the test taking place.

In short, don’t cheat. Keep your eyes on your own quiz. Do not discuss the quiz with your friends after you have taken it. Any violation will be punished as harshly as possible.

The best way to study for a STAT 385 quiz is by writing and reading code. Try to take an idea in STAT 385 and apply it to your own work.

With this being said, there are three other resources that may assist your studies:

• Topic Outline (Above)
• Lecture Code
• Homework

Again, the best way to study is to do programming in some fashion, whether that be writing code or explaining how code works to someone else.

Consider using resources such as:

1. RStudio Cloud Primers for interactive practice.
2. Exercise problems listed in a given section of the readings.

Do not spend time memorizing lecture slides. You will not see any verbatim questions.

Do not try pulling an all-nighter. You can schedule your quiz at any time within the testing window. To program efficiently, you need sleep, despite the quote:

“Programmers are an organism that turns caffeine into code.”

#### What kind of question types are on the quiz?

There are generally four types of problems:

• True / False
• Multiple Selection (e.g. select ALL correct answers from a list)
• Fill in the blank
• Writing Code

#### How many problems are on the quiz?

Only one question with 15012391 subquestions. In all seriousness, do not fixate on a number. There will be a reasonable number of questions for the time period.

#### How long will it take to do the quiz?

Depending on your background, the quiz may take:

• In-depth prior R experience: 25 minutes
• Some R experience: 35 minutes
• No R experience: 50 minutes

Avoid fixating on time. Life will come and go more quickly than you realize. Focus more on the content.

#### When will the quiz be returned?

As all problems are automatically graded, we should be able to post the quiz results after the examination window closes.

No.

#### We got our grades back, now will the quiz be curved?

No. Curving is only done sparingly at the end of the semester. Individual assignments are not modified.