
Monthly Archives: February 2013

Relationship between categorical variables

the way we explore relationships depends on how the variables have been measured.

Pearson's correlation coefficient to explore relationships between metric variables

t tests to explore relationships between one metric and one categorical variable

relationship between categorical variables

crosstabulations

produce a table called crosstabulation

DV goes in rows and IV goes in columns

some results can be misleading, e.g. if there were more in one group than another.

to compare, look at the percentages and not the raw numbers in the groups.

cells = values in rows

rows = rows of results

totals are usually calculated

calculate percentages: cell count / column total × 100 = column percentage
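
As a rough illustration of the percentage calculation, the sketch below builds a small crosstabulation in Python and works out column percentages. The counts and the labels ("group A", "yes" and so on) are made up for the example and do not come from any study.

```python
# Hypothetical counts: DV (outcome) in rows, IV (group) in columns.
counts = {
    "yes": {"group A": 30, "group B": 20},
    "no": {"group A": 10, "group B": 40},
}

groups = ["group A", "group B"]

# Column totals: how many people are in each IV group.
column_totals = {g: sum(row[g] for row in counts.values()) for g in groups}

# Column percentage = cell count / column total x 100.
column_percentages = {
    outcome: {g: 100 * row[g] / column_totals[g] for g in groups}
    for outcome, row in counts.items()
}

for outcome, row in column_percentages.items():
    print(outcome, {g: f"{pct:.0f}%" for g, pct in row.items()})
```

Here group A is smaller than group B, so the raw counts in the "yes" row (30 against 20) understate the gap that the percentages show (75% against 33%).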

Describing the relationship between two variables

identify what the two variables are

how are they measured.

how we expect the relationship to look

Analysis:

crosstabulation

DV in rows IV in columns

calculate column percentages.

use the comparison of the variables to describe the relationship.

How to:

Producing crosstabulations and chi square statistics.

analyse

descriptive statistics

crosstabs

transfer IV into column and DV into row

select column in cells then ok

for χ2 statistic click statistics

select chi square

click continue then ok

start with hypothesis or question

what are the two variables and how are they measured.

speculate how the relationship will look

who is being compared and what is the IV

produce crosstabulation

request column percentages

select one row and compare percentages

description starts with overview sentence

second sentence compares percentages

From sample to population – the χ2 test

use the χ2 test to test for significance

crosstabs procedure can be used to produce the χ2 statistic and p value

we use Pearson's chi square statistic value

which is reported as χ2, degrees of freedom (df) and p value

the chi square test is always non-directional

writing the report

structure of report:

research aim or hypothesis

info about the sample

if relationship significant then description of relationship in sample

explicit statement about whether relationship is sig or not.

the X2 (chi) statistics

statement relating back to findings

only focus on one aspect of DV (row) 

a) give sentence with no percentages

b) say which groups are likely to fall into this category.

c) sentence comparing all percentages in this row of the table

d) only refer to other rows if they substantially add to your understanding.

example 1

“It was hypothesised that people who use generic brands of toilet paper are more likely to scrunch than people who use Kleenex or Quilton.
In a sample of 71 students, those who used generic brand toilet paper were the most likely to scrunch followed by users of Kleenex. While 60% of generic brand users scrunched, 31% of Kleenex users and only 16% of Quilton users scrunched. The relationship between brand of toilet paper and scrunch/fold choice was significant, χ2(2) = 8.75, p = .013.
As expected, users of generic brand toilet paper are more likely to scrunch than users of Kleenex or Quilton.”

Example 2

“It was suggested that people with a tertiary education are less likely to own a Mac home computer than people with a secondary education.
In a sample of 200 people who own a home computer, those with a tertiary education were actually more likely to own a Mac than those with a secondary education. While 26% of participants with a tertiary education owned a Mac, only 13% of participants with a secondary education owned a Mac. The relationship between education level and type of computer owned is significant, χ2(1) = 5.31, p = .021.
Contrary to expectations, people with a tertiary education are more likely to own a Mac than people with a secondary education.”
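
The p values in the two example reports can be checked by hand. The sketch below is only an illustration: it relies on the closed forms of the chi square distribution for 1 and 2 degrees of freedom, and the function name `chi_square_p` is made up here. For any other df a statistics package would be needed.

```python
import math

def chi_square_p(chi2, df):
    """Upper-tail p value for a chi square statistic (df = 1 or 2 only)."""
    if df == 1:
        # chi square with 1 df is the square of a standard normal variable
        return math.erfc(math.sqrt(chi2 / 2))
    if df == 2:
        # chi square with 2 df is an exponential distribution with mean 2
        return math.exp(-chi2 / 2)
    raise ValueError("this sketch only handles df = 1 or df = 2")

print(f"{chi_square_p(8.75, 2):.3f}")  # statistic and df from example 1
print(f"{chi_square_p(5.31, 1):.3f}")  # statistic and df from Example 2
```

Rounded to three decimal places these come out at .013 and .021, the p values quoted in the two reports.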

parametric and non parametric tests

all the significance tests so far estimate a population parameter.

in each case we ask whether the sample statistic is significantly different from some reference value. 

Pearson's r = sample correlation sig > 0

one sample t test = sig higher than ref value

independent samples t test = difference in sample means sig > 0

95% confidence interval that difference between means lies between two limits.

chi square is non parametric 

χ2 does not measure strength of relationship

does not measure strength of relationship in the population.

it allows us to test if the sample relationship is strong enough to convince us that there is a relationship in the population. (test significance)

doesn't allow us to estimate strength of population relationship.

Basis for the χ2 test.

based on what we observe against what we would expect if there was no relationship.

the χ2 test looks at each cell in the table and compares observed and expected frequencies.

to calculate χ2, for each cell we calculate the difference between the observed and expected frequency, square the difference and divide by the expected frequency.

add up the contributions from each cell.
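
The calculation can be sketched in a few lines of Python. The observed counts below are hypothetical, and the expected count for each cell is taken as row total × column total / N, which is the usual formula for "no relationship" but is not spelled out in these notes.

```python
# Hypothetical observed counts: rows = DV categories, columns = IV groups.
observed = [
    [30, 10],
    [20, 40],
]

row_totals = [sum(row) for row in observed]
column_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

chi_square = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        # Expected count if there were no relationship between the variables.
        expected = row_totals[i] * column_totals[j] / n
        # Contribution of this cell: squared difference over expected.
        chi_square += (obs - expected) ** 2 / expected

print(round(chi_square, 2))
```

With these counts the expected frequencies are 20, 20, 30 and 30, and the four contributions add up to about 16.67.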

χ2 does not measure strength

the more cells in the table and participants in the study, the bigger the value of χ2

unlike Pearson's r, χ2 does not describe the relationship in the sample; it's only for significance

Degrees of freedom

when reporting χ2 always report df

df equation

“df = (r – 1) × (c – 1), where r is the number of rows and c is the number of columns”
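
A minimal sketch of the df equation, using a made-up function name. The two calls correspond to the shapes of the tables in the example reports earlier in these notes: scrunch/fold by three brands, and Mac/other by two education levels (assuming each DV has two categories).

```python
def degrees_of_freedom(rows, columns):
    """df for a crosstabulation with the given number of rows and columns."""
    return (rows - 1) * (columns - 1)

print(degrees_of_freedom(2, 3))  # 2 rows by 3 columns
print(degrees_of_freedom(2, 2))  # 2 rows by 2 columns
```

These give 2 and 1, matching the χ2(2) and χ2(1) reported in the two examples.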

 

 

 

 

 

which analysis do i use?

independent samples

paired samples

correlation/regression

when looking at the variables and questions: questions which ask you to compare things that are on the same scale are sensible, whereas comparing things that are not on the same scale (or measured in a similar way) does not make sense.

secondly we need to be clear whether we are asking "is this bigger than that" or "does more of this mean more of that". if it's the first, both things must be measured on the same scale; if it's the second, the two variables can be on different scales.

“1. Do people take longer [DV] to drive to work [IV(a)] than they take to drive home from work [IV(b)]? [metric] Paired samples T test as we are comparing two things against each other on the same scale. (rectangle)

2. Do people who take longer to drive to work also tend to take longer to drive home from work? (does more of this = more of that) metric and we are looking at the relationship between the two, therefore correlation and regression. (circle)

3. Do people think that appearance is more important than feeling good? (asking people to compare two variables like Q1 rather than the relationship between them) (rectangle) once again this is a paired samples t test as we want to compare these two things.

4. Do people who think that appearance (V) is more important tend to take longer to get ready for work? (V) (has two variables but they are on different scales so can't be compared directly) instead of comparing we are correlating. (circle) correlation and regression analysis.

5. Do full time workers take longer to get to work than part time workers? (compares two different groups of people) status V time. because we have a solid variable, i.e. time taken, and want to measure two different groups against that (triangle) we need to do an independent samples T test.

6. Do males think appearance is more important than females?” (compares equal measures of each against appearance) (triangle) therefore independent samples T test.

“a. Do older people tend to be more interested in cricket? (circle) Whats their interest? CRA
b. Do people who spend more time playing sport tend to be more interested in football? (Circle) whats their interest CRA

c. Are people more interested in cricket than in football? Rectangle PSTT

d. Are males more interested in ballet than females? Triangle ISTT

e. Do people spend more time watching TV than they spend playing sport? Rectangle PSTT
f. Do people who are more interested in ballet tend to be less interested in football? (circle) CRA

g. Do people who are members of a football club tend to be more interested in football than
people who are not members? Triangle ISTT

h. Do older people tend to spend less time playing sport?” Circle CRA
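
The choices above can be summarised as a small decision rule. This is only a sketch of the reasoning in these notes: the function name and its two yes/no arguments are invented for the illustration, and it only covers the three analyses listed here.

```python
def choose_analysis(different_groups, asks_relationship):
    """Pick an analysis from two yes/no questions about the research question.

    different_groups: True if two separate groups of people are compared.
    asks_relationship: True if the question is "does more of this go with
    more (or less) of that" rather than "is this bigger than that".
    """
    if asks_relationship:
        return "correlation and regression (circle)"
    if different_groups:
        return "independent samples t test (triangle)"
    return "paired samples t test (rectangle)"

# Question 1: the same people, drive to work vs drive home.
print(choose_analysis(different_groups=False, asks_relationship=False))
# Question 2: do longer drives to work go with longer drives home?
print(choose_analysis(different_groups=False, asks_relationship=True))
# Question 5: full time workers vs part time workers.
print(choose_analysis(different_groups=True, asks_relationship=False))
```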

 

 

Writing the report

when exploring the relationship between two metric variables

look at the scatterplot to check there is no curve, outliers or subgroups

if curved then all we can do is describe the pattern on the scatterplot

where linear we can

1) do Pearson's r; and

2) regression statistics.

what to put in report:

start with research question or hypothesis

information about sample

direction and strength of sample relationship.

significant or not with supporting statistics r n and p values

if relationship significant the 95% confidence interval (pearsons r)

if significant and the scales are familiar, interpretation of slope

finish by relating results back to question or hypothesis.

-can include regression coefficient (slope) 

if scales are familiar like height weight etc then interpreting slope gives a useful indication of what the relationship looks like.

if the scales are unfamiliar then interpreting the slope wont help the reader understand what the relationship looks like.

*r2 value has not been added to report

the coefficient of determination is helpful in deciding if the relationship is an important one and is usually included only in more detailed studies.

example 1

“It was hypothesised that people with more experience would tend to have higher salaries.
In a random sample of 30 Australians with full time employment, there was a moderate strength, positive, linear relationship between length of experience and weekly salary, and Pearson’s r shows that this relationship is significant, r = .65, n = 30, p < .001. The 95% confidence interval for Pearson’s correlation indicates that the strength of the relationship is between ρ = .32 and ρ = .54. In the sample, for each additional year of experience, on average, respondents earned an extra $28.06 per week. As expected, people with more experience tend to earn higher salaries.”

Example 2:

“It was hypothesised that students with higher IQ would tend to have better exam scores.
In a random sample of 100 students, there was an extremely weak, positive, linear relationship between IQ and exam scores and Pearson’s r indicates that this relationship is not significant, r = .03, n = 100, p = .341.
This study provides insufficient evidence to suggest that there is a relationship between IQ and exam scores.”

Example 3

“It was hypothesised that longer bushwalks would take more time.
In a random sample of 40 bushwalks, there was a weak, negative, linear relationship between distance walked and time taken, and Pearson’s r shows that this relationship is significant, r = –.37, n = 40, p = 0.021. The 95% confidence interval for Pearson’s correlation indicates that the strength of the relationship is between ρ = –.06 and ρ = –.61. In the sample, for each additional kilometre walked, on average, the time taken to complete the walk was 0.05 hours less.
Contrary to expectations, longer bushwalks tend to take less time.”

when inputting data into the 95% confidence interval for the correlation coefficient, the r value will be negative if the relationship is negative.
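
These notes do not say how the confidence interval applet works. One common way to get an approximate 95% confidence interval for a correlation is the Fisher z transformation, sketched below as an assumption rather than as the applet's actual method. The function name is made up for the illustration.

```python
import math

def pearson_r_ci95(r, n):
    """Approximate 95% confidence interval for a correlation (Fisher z method)."""
    z = math.atanh(r)               # transform r to the z scale
    se = 1 / math.sqrt(n - 3)       # standard error on the z scale
    margin = 1.96 * se
    # Transform the limits back to the correlation scale.
    return math.tanh(z - margin), math.tanh(z + margin)

lower, upper = pearson_r_ci95(-0.37, 40)  # r and n from Example 3
print(f"{lower:.2f} to {upper:.2f}")
```

For Example 3 this gives limits of roughly –.61 and –.07, close to the –.61 and –.06 quoted in the report. The small difference suggests the applet may round or calculate slightly differently. Note that a negative r goes in with its minus sign and both limits come out negative.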

 

 

 

The Regression Equation

Where there is a correlation it's prudent to explore the relationship between the variables.

The strength may be the same but the form may be different (half as much twice as much etc.)

The line of best fit is a summary of the relationship between the two variables.

The equation of this line is known as the regression equation

The equation for a straight line is Y = a + b × X, where X is the IV and Y is the DV. a is a constant known as the vertical intercept. b is a constant known as the slope or regression coefficient.

DV Daughters education

IV Mothers Education

1) DV = 2 + IV = The daughter has on average 2 more years of education than mum

2) DV = 2 × IV = The daughter has twice as much education as mum

3) DV = 0 + 2 × IV = If mum's education was zero the daughter's would also be zero, and for each additional year of mum's education the daughter has on average two more years of education.

4) DV = 0.5 × IV = on average daughters have half as much education as mums. if mum's education was zero the daughter's would be zero also, but for each additional year of mum's education the daughter would have 6 months more education.
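
To see how the different forms behave, the sketch below plugs one value of the IV into each equation. The figure of 10 years of mother's education is hypothetical, chosen only to make the arithmetic easy to follow.

```python
def predicted_dv(constant, slope, iv):
    """Predicted DV = constant + slope x IV."""
    return constant + slope * iv

mothers_education = 10  # hypothetical value of the IV, in years

print(predicted_dv(2, 1, mothers_education))    # 1) DV = 2 + IV
print(predicted_dv(0, 2, mothers_education))    # 2) and 3) DV = 0 + 2 x IV
print(predicted_dv(0, 0.5, mothers_education))  # 4) DV = 0.5 x IV
```

For a mother with 10 years of education the three equations predict 12, 20 and 5 years for the daughter, so the form of the relationship changes the prediction considerably.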

Two uses:

where we know the value of the IV it allows us to predict the DV.

helps us describe the relationship.

using the regression equation to make predictions.

by drawing data from the slope.

in social sciences when the relationship is weak or only moderate, describe and understand the relationship rather than predict.

Determining the line of best fit.

to know more about a relationship we can fit a regression line to a scatterplot.

the difference between the actual value and the predicted value (on the line) is a residual, and the line is chosen to make the sum of the squared residuals as small as possible. (least squares)
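
A minimal sketch of fitting a least squares line by hand, using the standard formulas for the slope and constant. The five data points are hypothetical. SPSS does the same job through the linear regression procedure.

```python
# Hypothetical data: IV (x) and DV (y) for five cases.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Least squares slope and constant (vertical intercept).
slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sum(
    (xi - mean_x) ** 2 for xi in x
)
constant = mean_y - slope * mean_x

# Residual = actual value minus the value predicted by the line.
residuals = [yi - (constant + slope * xi) for xi, yi in zip(x, y)]
sum_squared_residuals = sum(res ** 2 for res in residuals)

print(f"Predicted DV = {constant:.2f} + {slope:.2f} x IV")
print(f"sum of squared residuals = {sum_squared_residuals:.2f}")
```

For these points the line is Predicted DV = 2.20 + 0.60 × IV, and any other straight line through the same points would give a larger sum of squared residuals than 2.40.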

Interpreting the SPSS output

Producing linear regression statistics

How to:

analyse

regression

linear

transfer DV and IV into relevant boxes.

then;

statistics

tick confidence intervals (for 95% CI)

click continue then ok

Interpreting results

look at the last table, Coefficients

first column indicates regression equation.

standard regression equation: 

“Predicted DV = constant + slope × IV”

From sample to population

95% confidence interval.

This information is contained in the same table last column 

 

 

 

 

 

Module 7

Relationships between metric variables

eg do those with more experience earn higher salaries:

to explore the relationship between metric variables start with a scatterplot.

draw a graph and plot the DV on vertical axis and IV on horizontal axis.

*If the scatterplot shows no outliers or curve in the relationship then use Pearson's correlation coefficient.

How to:

Click on graphs in menu bar

Chart builder

Choose from (Scatter dot)

Choose from type of scatterplot (first one is simple)

In element properties ensure that statistics is set to value.

Drag variable tags into correct axis.

To place a line in this graph double click and go to chart editor

click on scatterplot with line icon and a regression line will be fitted to the graph.

Describing Scatterplots

should be described on four characteristics:

Direction, strength, form, outliers.

more of this = more of that = a positive relationship = an upward trend from left to right.

or

more of this = less of that = general downward trend from left to right.

Form

how do the points cluster along the line.

when the relationship can be represented by a straight line it's called linear.

other relationships are better represented by curves; these are non-linear relationships.

Strength

How closely do the points cluster around the line used to summarise the relationship.

Where there's no curved pattern use the summary statistic Pearson's correlation.

Outliers

scatterplots are great at showing outliers. some outliers are bivariate outliers.

Pearson's correlation coefficient

where no curved pattern exists it can be used to objectively measure the strength of the relationship.

Pearson's correlation ranges from -1 to +1. where the points fall on a perfect line the value is +1 or -1; if there is no relationship it's zero.
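
The range of values can be illustrated with a by-hand calculation. The sketch below uses the standard formula for Pearson's r on three small hypothetical data sets: points on a rising line, points on a falling line, and points with no linear trend.

```python
import math

def pearson_r(x, y):
    """Pearson's correlation coefficient for two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]
print(pearson_r(x, [3, 5, 7, 9, 11]))  # points on a rising line
print(pearson_r(x, [10, 8, 6, 4, 2]))  # points on a falling line
print(pearson_r(x, [2, 1, 3, 1, 2]))   # no linear trend
```

The first two data sets give +1 and -1, and the third gives a value of zero (to within rounding error).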

How to:

where no outliers and no curve click on

analyse, correlate and bivariate.

transfer variables into variable box and click ok

value of pearsons correlation is represented in the r = xxx statistic.

guidelines for interpreting the value of Pearson's r

*- and + signs are ignored.

.75 or more is a strong relationship

.45-.74 is a moderate relationship

.25-.44 is weak

.24 or less is extremely weak
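
The guidelines translate directly into a small lookup. In this sketch the function name is made up, and values that fall between the listed bands (for example .745) are assigned to the lower band, which is an assumption.

```python
def describe_strength(r):
    """Label the strength of a correlation, ignoring its sign."""
    size = abs(r)
    if size >= 0.75:
        return "strong"
    if size >= 0.45:
        return "moderate"
    if size >= 0.25:
        return "weak"
    return "extremely weak"

for r in (0.65, 0.03, -0.37):  # r values from the three example reports
    print(r, describe_strength(r))
```

The labels that come out (moderate, extremely weak, weak) match the wording used in the three example reports earlier in these notes.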

describing the relationship

the coefficient of determination r2

what counts as an important correlation differs greatly between the different sciences, so the coefficient of determination is used instead. it is the square of the correlation coefficient.
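
A quick worked example, using the r value from the salary example report. Reading r2 as the proportion of variation in the DV accounted for by the IV is the usual interpretation, although these notes do not spell it out.

```python
r = 0.65  # sample correlation from the salary example report
r_squared = r ** 2

print(f"r2 = {r_squared:.2f}")
print(f"{r_squared:.0%} of the variation in the DV is accounted for by the IV")
```

So a moderate correlation of .65 corresponds to an r2 of about .42.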

where there is a curved pattern Pearson's r underestimates the strength of the relationship.

other reasons to check scatterplot

subgroups and outliers can grossly distort the results.

From sample to population

Testing for significance for Pearson's correlation coefficient

to generalise from the results of the sample back to the whole population we need to do a significance test (such as the test for Pearson's r)

in a report we should describe the relationship as weak moderate strong, positive or negative and linear or non linear. name the variables, the significance and plain language description of what that means.

95% confidence interval for the correlation coefficient

the website provides an applet to calculate the 95% confidence interval.

Greek symbol ρ (rho) is used to represent the population correlation.

correlation does not always mean causation

when we find a correlation we can't always be certain why the correlation occurred; it may be because one variable affects the other or because of a nuisance variable.

spurious relationships

nuisance variables may be affecting both IV and DV

nuisance variable may affect the IV causing changes in the DV

the IV is acting indirectly through mediating variable

Exam intro to research

Preparation for the final exam

Content

  • Exam Structure

    It is a closed book, three hour exam and is worth 60% of your final mark. Students who do not achieve a mark of at least 40% on the exam will not be considered for a pass in this unit.

    The exam is a written paper with no practical component. All SPSS outputs you require will be provided on the exam paper. The exam concentrates on the material in Modules 3 to 8, but competence in the materials in Modules 1 & 2 is assumed. The trial exams should give you a good indication of the type of questions you might expect on the exam.

    As with the assignments, questions assessing the most basic skills are designated with a **. You must get at least 80% of the marks on the ** questions and at least 40% on all of the formal reports in order to pass the exam.

    The exams are organised through OUA Examination Services and by now you should have indicated which examination venue you plan to attend. If you have registered with disabilities support and have special requirements for the exam check with your disabilities officer that this has been organised for you.

  • What can you take into the exam?

    You should take a calculator into the exam (you are not permitted to use transmitting devices, eg. the calculator on your mobile phone, iPads, etc).

    You can also take an A4 sheet of notes. This can be double sided, can be handwritten or printed and can contain any information you wish. It must be a single sheet of paper with no attachments stuck onto it. This page of notes will be collected with the exam paper.

    Please do not use pencil or red pen for your exam answers.

  • Model reports

    Attached Files:

    Attached is an MS-Word document containing all of the model reports from the text which may be relevant on the exam. We strongly recommend that you include some of these model reports on the A4 sheet of notes you can take into the exam with you. Be selective, you don’t need to include all of the models 🙂

  • Exam Marking Criteria

    Attached Files:

    Criterion based marking will be used on the exam. There are certain criteria you must fulfil in order to achieve a pass, a credit etc. The full criteria for the exam are attached here.

    The most important criteria are those for  a pass. You must achieve  at least 80% of the marks on the ** questions and at least 40% on each Formal Report in order to pass. So make sure you  complete all of the Formal Reports and ** questions before spending time on the more advanced questions.

  • Suggestions

    Start your exam preparation by reviewing the assessments you’ve done so far. Review your topic tests and assignments and make sure you understand where you went wrong in any questions you got incorrect.

    Next try some of the revision exercises in the interactive room. Post questions on the discussion board about anything you’re not clear on.

    Construct an A4 sheet to take into the exam. Constructing an A4 sheet of notes is a good way of getting things clear in your mind. Your A4 sheet should definitely include some of the model reports. You might also include other helpful reminders like: Significant if p < .050, or anything you’re worried you might forget.

    Try trial exam 1. Complete it under ‘exam conditions’. That is, see how far you get in three hours and use only your A4 sheet. This will tell you if you need to add anything to your A4 sheet and will also give you an indication of timing. Check your answers against the solutions.

    Brush up on any areas you were unsure of in trial exam 1. You can post questions on the discussion board about answers to the trial exams. Complete trial exam 2 and compare your answers to the solutions given.

  • Trial Exams and Solutions

    Attached Files:

    These trial exams are of the same standard as the actual exam and of approximately the same length. However, the questions on the trial exams may assess different aspects of the theory we have covered or assess the same theory in a different way, so the exam will not be a perfect match to these trial exams.

    So don’t base your study on the trial exams. There may be reports on the actual exam which are not included in the trial exams.  You should do your revision first, then try the trial exams. This will give you an indication of the timing on the exam and of how well prepared you are for the exam.

    Note that on both the trial exams and the actual exam there are some very difficult questions – we need some questions to distinguish between our distinction and our high distinction students. The questions towards the end of the paper contain more of the higher order skills.

Rounding

The Topic Tests and Assignments always state how we want answers. 

Common sense rounding is the best. E.g. talking dollars and cents doesn’t ‘seem right’ at 3 decimal places. Dollars and cents is often okay for rounding, but if we were dealing with millions of dollars for some stock market data, it is not necessary to state the 17 cents, for example.

p-values are always 3 decimal places because we are talking about the chances out of 1000 of getting our result.

t-test stat always 2 decimal places (and drop the negative sign for the report). If you get told to round to a certain decimal place, follow the instructions as outlined in the additional resources in the Learning Material Folder. For a lot of our examples 2 decimal places tend to be good but “always use 2 decimal places” is not a wise rule.

For things that can only be whole numbers, it is still okay to have decimals for averages (ie. average goals scored = 3.56 is fine).  

Above all else “sensible rounding” (using common sense for the scale and scenario) is the best approach.
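
The two fixed rules (p values to 3 decimal places, t statistics to 2 decimal places without the sign) can be sketched as small formatting helpers. The function names are made up, and dropping the leading zero and writing "p < .001" for very small values follow the style of the example reports rather than anything stated in this post.

```python
def format_p(p):
    """Report a p value to 3 decimal places without the leading zero."""
    if p < 0.0005:
        return "p < .001"
    return "p = " + f"{p:.3f}".lstrip("0")

def format_t(t):
    """Report a t statistic to 2 decimal places, dropping the negative sign."""
    return f"{abs(t):.2f}"

print(format_p(0.01259))   # p = .013
print(format_p(0.00002))   # p < .001
print(format_t(-2.4567))   # 2.46
```

Everything else (dollars, averages, slopes) still needs a judgement about what is sensible for the scale, which a fixed rule like this cannot supply.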

Chapter 9- Writing well

Functional writing

Clear

complete

concise

considerate

concrete 

correct

be precise and concise, avoid jargon

Sentences:

Begin with a capital

end with a full stop

have a subject

have a finite verb

express a complete thought.

avoid run on sentences

use an active voice instead of a passive one.

use appropriate vocabulary and structure.

Chapter 5- Essay Writing

Essays

Cause and effect

Reactionary

Argumentative

Write a first draft and then:

go through the document and give each paragraph a heading

use view outline in Word to view your headings: are they logical, do they progress logically?

Where necessary rearrange, add or delete headings

go through the process again and again until it's logical.

State aims and purpose

make conceptual framework clear

state context

eg:

Topic

Taskword

content words

organisation

Plan intro body

conclusion

Paragraphs:

a) Topic Sentence

b) Supporting sentence

c) Clincher

quotes: if less than 30 words incorporate into your own text; if more, indent and use italics and quotation marks.

Chapter 7- Presenting research findings

Reports v essays: reports are a dispassionate analysis where an essay is an opinion of the author's work drawing on secondary sources.

Reports should answer the 5 Ws (who, what, where, when, why); it may be in a highly structured way (to replicate) or not.

Key is writing the work in a way that makes it verifiable.

General Layout:

*well executed research

*The will to communicate effectively.

-Short Research and Lab Reports

Title page

executive summary

Introduction

Materials and methods

Results

Discussion and conclusion

References

One method is to have separate sheets of paper or files for each section, labelled, and as you work through the report write in the appropriate section, starting with the easiest, which is usually the methods section.

Title page

short and accurate

abstract

brief coherent and concise statement 100-250 words

Informative/indicative.

informative (concise account of details)

indicative (summary of longer report not recounting everything)

Table of contents – accurate

introduction

What question is being asked and what do you hope to learn.

Literature review

Comprehensive and critical of work in the field like an overview.

-prevents duplication

-identifies gaps

-increases own knowledge

-identifies others in the field.

-puts own work in perspective.

a good literature review will discuss significant and notable books in the field and the available research on your narrow field.

provides the reader with the conceptual and disciplinary origins of your study. it is helpful and important to look at the whole field of study and draw together themes that you see from various articles.

in review, should be enough information to replicate the study.

materials and methods

subjects and samples: why did you choose them.

apparatus/material: what did you do

procedure: how did you do it.

results: what did you find out

discussion: what did the findings mean. the main aim here is to explain results and explore significance.

recommendations

references

appendices.

should be written in third person language.

avoid jargon, move from general to specific