Data Analysis

STAT 250 Summer 2019 Data Analysis Assignment 4
Your submitted document should include the following items. Points will be deducted if the
following are not included.
1.
Type your Name and STAT 250 with your correct section number (e.g. STAT 250-xxx)
right justified and then Data Analysis Assignment #4 centered on the top of page 1
2.
3.
corresponding number and subpart. Keep the answers in order. Do not include the
4.
Generate all requested graphs and tables using StatCrunch.
5.
6.
You may not work with other individuals on this assignment. It is an honor code
violation if you do. In addition, using materials from a previous semester of STAT 250
(whether your own or someone else’s) is cheating.
Elements of good technical writing:
Use complete and coherent sentences to answer the questions.
Graphs must be appropriately titled and should refer to the context of the question.
Graphical displays must include labels with units if appropriate for each axis.
Units should always be included when referring to numerical values.
When making a comparison you must use comparative language, such as “greater than”, “less
than”, or “about the same as.”
Ensure that all graphs and tables appear on one page and are not split across two pages.
Type all mathematical calculations when directed to compute an answer ‘by-hand.’
Pictures of actual handwritten work are not accepted on this assignment.
When writing mathematical expressions into your document you may use either an equation
editor or common shortcuts such as:
√x can be written as sqrt(x), p̂ can be written as p-hat, x̄
can be written as x-bar.
Problem 1: Appropriateness of Inference
For the following scenarios, answer the questions for each part. In each part, the underlined text
is the name of the StatCrunch data set to be used for that part. Please note, do not conduct
inference in either of these parts; just answer each question.
a) Food Prices: Target versus Safeway. Grocery prices of the same randomly selected
items were collected and compared from Target and Safeway. Imagine you were
interested in conducting a hypothesis test to determine whether the mean prices were
significantly different. Note: to answer the questions below, subtract Target price –
Safeway price (i.e. subtract Safeway price from Target price).
i) What is (are) the parameter(s) of interest? Choose one of the following symbols:
μ (the mean of one sample), μ_D (the mean difference from paired (dependent)
samples), or μ1 − μ2 (the mean difference of two independent samples), and describe the
parameter in context of this question in one sentence.
ii) Depending on your answer to part (i), construct one or two relative frequency
histograms. Remember to properly title and label the graph(s). Copy and paste these
graphs into your document.
iii) Describe the shape of the histogram(s) in one sentence.
iv) Depending on your answer to part (i), construct one or two boxplots and copy and
paste these graphs into your document.
v) Does the boxplot (or do the boxplots) show any outliers? Answer this question in one
sentence and identify any outliers if they are present.
vi) Considering your answers to parts (iii) and (v), is inference appropriate in this case?
Why or why not? Defend your answer using the graphs in two to three sentences.
b) GMU Health Center Waiting Time. During the flu season, it is known that the waiting
time at the GMU Health Center can be extreme. A statistics student wanted to test her
claim that the wait time was greater than 100 minutes. She took a random sample of wait
times during the flu season and recorded them in StatCrunch.
i) What is (are) the parameter(s) of interest? Choose one of the following symbols:
μ (the mean of one sample), μ_D (the mean difference of two paired (dependent)
samples), or μ1 − μ2 (the mean difference of two independent samples), and describe the
parameter in context of this question in one sentence.
ii) Depending on your answer to part (i), construct one or two relative frequency
histograms. Remember to properly title and label the graph(s). Copy and paste the
graph(s) into your document.
iii) Describe the shape of the histogram(s) in one sentence.
iv) Depending on your answer to part (i), construct one or two boxplots and copy and
paste these graphs into your document.
v) Does the boxplot (or do the boxplots) show any outliers? Answer this question in one
sentence and identify any outliers if they are present.
vi) Considering the answers provided in parts (iii) and (v), is inference appropriate in this
case? Why or why not? Defend your answer using the graphs in two to three
sentences.
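As a rough cross-check on parts (v) of Problem 1, the sketch below applies the common 1.5 × IQR boxplot rule to a list of values. The function name `iqr_outliers`, the sample prices, and the choice of the "inclusive" quartile method are all assumptions made for illustration; StatCrunch may compute quartiles slightly differently, so its boxplot remains the reference for the assignment.

```python
import statistics


def iqr_outliers(values):
    """Flag values outside the 1.5 * IQR fences (the usual boxplot rule)."""
    # Quartile conventions differ between tools; "inclusive" is an assumption
    # here and may not match StatCrunch's quartiles exactly.
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    outliers = [v for v in values if v < lower_fence or v > upper_fence]
    return lower_fence, upper_fence, outliers


# Hypothetical prices in dollars, NOT the StatCrunch data set.
target = [2.50, 3.00, 4.25, 1.75, 5.00, 2.25]
safeway = [2.75, 3.50, 4.00, 2.25, 9.00, 2.50]

# One difference per item, Target price minus Safeway price.
differences = [t - s for t, s in zip(target, safeway)]
lower, upper, flagged = iqr_outliers(differences)
print(f"fences: ({lower:.2f}, {upper:.2f}) dollars; outliers: {flagged}")
```

Any value reported in `flagged` would appear as an individual point beyond the whiskers of a boxplot drawn with the same quartile convention.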
Problem 2: GPA of Students Depending on Where They Sit.
A professor wanted to know whether there was a difference in students’ grade point averages
(GPA) depending on whether they sit in the front half of the classroom versus the back half of
the classroom. In a previous semester, a random sample of students was selected from the front
of a classroom and another random sample was selected from the back of a classroom, and each
student’s current GPA was recorded. The data provided in StatCrunch represent the GPAs from
each random sample. The file is called “GPA Versus Seating Location.” At the 0.01
significance level, can the professor conclude from these data that the mean GPA for front sitters
is higher than the mean GPA for back sitters? Assume all conditions for conducting inference are satisfied.
Conduct a full hypothesis test by following the steps below. Enter an answer for each of
a) Define the population parameter of interest in context of this question in one
sentence.
b) State the null and alternative hypotheses using correct notation.
c) State the significance level for this problem.
d) Calculate the test statistic in StatCrunch using STAT → T Stats → 2 Sample →
With Data. Copy and paste the output table into your document.
e) Label the p-value seen in your output table produced in part (d) using the
probability notation (it begins with P(…)).
f) State whether you reject or do not reject the null hypothesis and your reason for
g) State your conclusion in context of the problem (i.e. interpret your results and/or
answer the question being posed) in one or two complete sentences.
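To see what the two-sample output in part (d) is built from, here is a minimal sketch of the unpooled (Welch) two-sample t statistic and its approximate degrees of freedom. It assumes variances are not pooled; whether that matches the StatCrunch option used in the course is an assumption. The function name `welch_t` and the GPA values are hypothetical and are not the "GPA Versus Seating Location" data.

```python
import math
import statistics


def welch_t(sample1, sample2):
    """Return (t, df) for the unpooled two-sample t statistic."""
    n1, n2 = len(sample1), len(sample2)
    mean1, mean2 = statistics.mean(sample1), statistics.mean(sample2)
    # Sample variances (divisor n - 1).
    var1, var2 = statistics.variance(sample1), statistics.variance(sample2)

    # Standard error of the difference in sample means.
    se = math.sqrt(var1 / n1 + var2 / n2)
    t = (mean1 - mean2) / se

    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (var1 / n1 + var2 / n2) ** 2 / (
        (var1 / n1) ** 2 / (n1 - 1) + (var2 / n2) ** 2 / (n2 - 1)
    )
    return t, df


# Hypothetical GPAs for illustration only.
front = [3.6, 3.2, 3.8, 3.4, 3.9, 3.1]
back = [3.0, 2.8, 3.5, 2.9, 3.3, 2.6]

t_stat, df = welch_t(front, back)
print(f"t = {t_stat:.4f}, df = {df:.2f}")
```

For the one-sided alternative that front sitters have the higher mean, the p-value is P(T > t) for a t distribution with `df` degrees of freedom. The standard library has no t distribution, so that probability is left to the StatCrunch output.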
Problem 3: Metal Hardness Testing
The manufacturer of hardness testing equipment uses steel-ball indenters to indent metal that is
being tested. However, the manufacturer thinks there might be a difference in hardness reading
when using a diamond indenter. The metal specimens to be tested are large enough so that two
indentations can be made. Therefore, the manufacturer wants to use both indenters on each
specimen and compare the readings. The order of the indentations will be random. This
particular design is called the paired design (or matched pairs design or dependent samples
design). Assume all conditions are satisfied in this problem. The data set used for this problem
is called “Metal Hardness Testing”.
a) Calculate the difference between specimens by subtracting Steel Ball – Diamond. For
example, the first difference is 51 – 52 = -1. List the difference for each of the 14 pairs in
b) For the first piece of metal, which indenter produced the larger hardness reading?
Answer this question in a complete sentence.
c) Obtain the mean of these differences and the standard deviation of these differences in
StatCrunch. You may copy and paste the box that you obtain from StatCrunch or list the
values. Please round these values to four decimal places.
d) Construct a 95% confidence interval using the above data. Please do this “by hand”
(found in the last page of our formula packet) to obtain your t* critical value needed for
the confidence interval. Present this confidence interval as (lower limit, upper limit).
e) Use StatCrunch to obtain a 95% confidence interval for the above data by selecting:
Stat → T Stats → Paired. Enter Steel Ball for Sample 1 and Diamond for Sample 2.
f) Does your confidence interval capture 0? Answer this question and briefly explain what
this implies in one or two sentences in the context of the question.
g) Using your answer to part (f), imagine you were using a hypothesis test to determine if a
significant difference exists in mean hardness reading between the two indenters (the
hypotheses would be H0: μ_D = 0 vs Ha: μ_D ≠ 0). What decision and conclusion can be
made in this case? Provide an answer and a reason for your choice in one or two
run this hypothesis test).
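The paired-design interval in parts (c) through (f) follows the usual form d-bar ± t* × s_d / sqrt(n). The sketch below is one way to lay out that arithmetic. The function name `paired_ci` is hypothetical, and apart from the first pair (51, 52), which is quoted in part (a), every reading is made up for illustration and is not the "Metal Hardness Testing" data. The critical value 2.160 is the standard t-table entry for 95% confidence with 13 degrees of freedom (14 pairs).

```python
import math
import statistics


def paired_ci(sample1, sample2, t_star):
    """Confidence interval for the mean paired difference (sample1 - sample2)."""
    diffs = [a - b for a, b in zip(sample1, sample2)]
    n = len(diffs)
    d_bar = statistics.mean(diffs)   # mean of the differences
    s_d = statistics.stdev(diffs)    # sample standard deviation of the differences
    margin = t_star * s_d / math.sqrt(n)
    return d_bar - margin, d_bar + margin


# Hypothetical hardness readings; only the first pair (51, 52) appears in the text.
steel_ball = [51, 48, 55, 60, 47, 52, 58, 49, 53, 56, 50, 54, 57, 46]
diamond = [52, 50, 54, 63, 47, 55, 59, 51, 52, 58, 53, 54, 60, 48]

# t* for 95% confidence with df = 14 - 1 = 13, read from a t-table.
lower, upper = paired_ci(steel_ball, diamond, t_star=2.160)
captures_zero = lower <= 0 <= upper
print(f"({lower:.4f}, {upper:.4f}); captures 0: {captures_zero}")
```

If the interval captures 0, a two-sided test of H0: μ_D = 0 at the matching 0.05 significance level would not reject H0; if it excludes 0, the test would reject H0.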
Problem 4: Lego Prices
The data set named “Lego Prices” contains a selection of Lego sets sold on the Lego website in
August 2016. The goal of this problem is to explore one variable (the number of Pieces a set
contains) that may help a buyer predict the price of a Lego Set. The Price variable is the
response variable in this problem.
a) Investigate the relationship between the explanatory variable “Pieces” and response
variable “Price” by doing the following:
i) Make a scatterplot and copy and paste it in your solutions (use Graph → Scatter
Plot in StatCrunch).
ii) Calculate the correlation coefficient (use Stat → Summary Stats → Correlation in
StatCrunch). Provide this value in your document.
iii) Interpret the scatterplot and correlation coefficient in terms of trend, strength, and
shape (form) in one complete sentence.
b) Using the “Pieces” variable as the explanatory variable, run a Simple Linear Regression
analysis in StatCrunch. Use Stat → Regression → Simple Linear. Copy and paste only
the StatCrunch results output (no tables).
c) Add the fitted line plot to your document. This graph appears on page 2 of your output.
d) Type the regression equation into your document.
e) Interpret the slope of the regression line (in context of this data set).
f) Is it meaningful to interpret the y-intercept? Why or why not?
g) State r-squared (i.e., the coefficient of determination) and explain what this value means
in context of the data set.
h) Use the regression equation from part (d) to predict the price of a randomly selected set
containing 556 pieces. State your predicted value in a sentence that is in context of the
data. Do not forget to mention the units. Note: You can do this calculation “by hand” or
using StatCrunch.
i) Is your prediction in part (h) an example of extrapolation? Why or why not?
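The quantities requested in Problem 4 (correlation, slope, intercept, r-squared, a prediction, and an extrapolation check) can be sketched from the least-squares formulas as follows. The function name `simple_linear_regression` and the piece counts and prices are hypothetical and are not the "Lego Prices" data; the StatCrunch output remains the source for the actual answers.

```python
import math


def simple_linear_regression(x, y):
    """Return (slope, intercept, r) for the least-squares line y = intercept + slope * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sums of squares and cross-products about the means.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    r = sxy / math.sqrt(sxx * syy)  # correlation coefficient
    return slope, intercept, r


# Hypothetical sets: number of pieces and price in dollars.
pieces = [100, 250, 400, 600, 900, 1200]
price = [9.99, 24.99, 39.99, 59.99, 89.99, 119.99]

slope, intercept, r = simple_linear_regression(pieces, price)
r_squared = r ** 2

# Predicted price for a set with 556 pieces.
predicted_price = intercept + slope * 556

# A prediction is an extrapolation when x lies outside the observed range of x.
is_extrapolation = not (min(pieces) <= 556 <= max(pieces))

print(f"price-hat = {intercept:.4f} + {slope:.4f} * pieces, r^2 = {r_squared:.4f}")
print(f"predicted price: ${predicted_price:.2f}; extrapolation: {is_extrapolation}")
```

The slope is read as the predicted change in price, in dollars, for each additional piece, and r-squared as the proportion of variation in price explained by the linear relationship with the number of pieces.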
Sample Solution to Display Formatting
A random sample of 30 students was selected from a STAT 250 course taught during the
summer session and their first exam scores were recorded.
a) Create a histogram in StatCrunch. Be sure to title and label it correctly.
b) Interpret the histogram’s shape
See sample solution and formatting on page 2.
Following the main points will help you submit a professionally completed assignment.
1) Right justify your name and provide your correct section and the due date.
2) Center the specific homework assignment title.
3) Bold each complete problem number.
4) The graph can be around the size below for readability (click on the graph once and
adjust its size only by using the bottom-right dot).
5) Keep the assignment in problem and part order (present 1a, then 1b, and so on).
Kenneth Strazzeri
Data Analysis Assignment 1
Problem X
a)
b) The shape of this distribution is left skewed because I see the majority of the data values
falling in the upper end of the distribution and a few 50s and 60s skewing the shape. There do
not seem to be any outliers visible on the graph.
