QUANTITATIVE RESEARCH - Universiti Utara...

Post on 06-Nov-2020

1

QUANTITATIVE RESEARCH METHODOLOGY

NOR IDAYU MAHAT

CENTRE FOR UNIVERSITY-INDUSTRY COLLABORATION (CUIC)

UNIVERSITI UTARA MALAYSIA

04-928 4098 / noridayu@uum.edu.my

2

Contents

Basic concepts

o Statistics and research

Sampling, techniques and procedures

Measurements:

o Scale

o Adequacy, validity, reliability and sensitivity

Exploring your data

Statistical inference

Hypothesis testing

Analysis of difference

Complex analyses

3

Basic concept

Pure Basic

• Experimental and theoretical work undertaken to acquire new knowledge for the advancement of knowledge.

Strategic Basic

• Experimental and theoretical work undertaken to acquire new knowledge for specified broad areas in the expectation of useful discoveries.

Applied

• Original work undertaken to acquire new knowledge with a specific application in view, e.g. to determine possible uses for the findings of basic research.

Experimental

• Systematic work, using existing knowledge for the purpose of creating new or improved products/processes.

Research Activity Types

4

Example Research

Example: Modification of an existing control chart (Nor Idayu Mahat & Sharipah Soaad, 2011) This study discusses the problem of constructing control charts for multiple quality characteristics when the traditional Hotelling T² fails to detect shifts in the mean or in the relationship among the measured quality characteristics. Alternative control charts based on a modified one-step M-estimator, which is robust towards outliers, are proposed to overcome this weakness... Results from simulation studies showed that the proposed robust control charts offer better performance... when the variables are independent or dependent.

5

Example Research

Example: The use of Principal Component Analysis in monitoring gear faults (Li et al., 2003) This paper presents a study that uses principal component analysis to reduce the dimensionality of the feature space and to get an optimal subspace for machine fault classification... The experimental results indicate that the method extracts diagnostic information effectively for gear fault classification and has good potential for application in practice.

6

Basic concept

Quantitative Research

Scientific application of mathematical principles to the collection, analysis and presentation of numerical data.

Mathematical principles (?)

Collection – knowledge of how to design surveys and experiments in order to get information.

Analysis – processing and analysing the collected information to answer some questions.

Presentation – interpreting the results obtained from the analysis in some meaningful way.

7

Basic concept

When is quantitative analysis needed?

There is a need to present and to interpret numerical data.

There is a need to test some defined statements mathematically.

The aim is to classify variables, count them, and construct statistical models in an attempt to explain what is observed.

Precise prediction is a major concern.

8

Basic concept: What is data?

Numbers

Measurements

Words

Figures

9

Basic concept: Types of data

o Secondary data

• data that has already been collected.

• It could be raw data or compiled data.

o Secondary sources:

• Hardcopies – books, articles, directories, conference papers, newspapers, magazines, research reports and market reports.

• Electronic resources – CD-ROM, on-line databases, internet, videos and broadcasts.

10

Basic concept: Types of data

o Primary data ~ the researcher collects the data herself.

o Methods

• Observation

• Experiment

• Interviews: face-to-face interview, focus group, panels

• Questionnaire

• Diaries

• Portfolios

11

Basic concept: Types of data

Secondary data | Primary data
May not match your needs. | Usually match your needs.
Access may be difficult or costly. | Original.
May save some cost and time. | Sometimes involve some cost and time.
Allow for longitudinal studies. | May not be appropriate for longitudinal studies.
Concern: validity of some secondary data (e.g. internet sources). | Concern: validity of the process used to collect the data.

12

Population and sample

Where can we get the data?

Population – all entities (people or items) with the characteristics one wishes to study.

Population structure describes the relative numbers of entities with similar characteristics.

Sample – a subset of the entities in the population, used to answer questions about the population as a whole.

13

Population and sample

Principle of Sampling

o Entities in a sample must be

• taken from the target population following some standard procedures.

• able to represent the actual population.

• adequate to be used in the analysis parts.

• adequate to supply necessary information to the research questions.

14

Population and sample

[Figure: three candidate samples (Sample A, Sample B, Sample C) drawn from a population, with a question mark]

15

Basic concept of statistical tools

Before we decide to use either a population or a sample, let us focus on statistical tools...

Descriptive statistics

Procedures to summarise and to describe the important characteristics of a set of measurements.

Arts of statistics.

Inferential statistics

Procedures to make inferences about population characteristics from information contained in a sample drawn from the target population.

16

Basic concept of statistical tools

o Probability sampling

• Every object in the population has a known, non-zero chance of being chosen for the sample (an equal chance under simple random sampling).

• Less biased sampling procedure.

o Nonprobability sampling

• Objects in a sample are usually selected on the basis of accessibility.

• Biased sampling procedure.

17

Sampling methods

Nonprobability sampling

5. Quota

6. Snow-ball

7. Convenience (opportunity)

8. Purposive

9. Self-selection

Probability sampling

1. Simple random

2. Systematic sampling

3. Stratified sampling

4. Cluster sampling (and multi-stage)

18

Probability sampling

o Researcher must ensure that every object has equal opportunity for selection

o Randomisation is a must.

o The techniques are free of systematic and sampling bias.

19

Sampling methods: example

In the early stages of planning a school restructuring effort, school district board members are considering a year round schooling program. For the moment, the board is interested in the degree to which parents/legal guardians favor such a change. A simple random sample (n = 300) of parents/legal guardians was drawn from 1,850 families (only one adult per household) and given a questionnaire.

20

Sampling methods

1. Simple random sampling

The ideal choice for obtaining objects at random.

Every object for the sample must be

o selected at random from the population list.

o given the same chance of being selected.

Disadvantages

o The population list is hard to obtain.

o Sometimes it is difficult to reach the objects that have been identified.
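A simple random sample like the one in the school district example (n = 300 out of 1,850 families) could be drawn as in the sketch below. The family IDs 1 to 1850 are a hypothetical sampling frame, and the function name and seed are illustrative assumptions.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct units; every unit has the same chance of selection."""
    rng = random.Random(seed)
    return rng.sample(population, n)

# Hypothetical sampling frame: one ID per family (one adult per household).
families = list(range(1, 1851))
sample = simple_random_sample(families, 300, seed=42)
print(len(sample), sorted(sample)[:5])
```

The sketch presupposes a complete population list, which is exactly the practical difficulty noted above.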

21

Sampling methods

2. Systematic sampling

Sampling procedure

1. Prepare a list of all objects in the population.

2. Select the first object at random from the population list.

3. Select each subsequent object at an interval of k from the previous selection.

4. Repeat the selection in (3) until the number of objects obtained meets the required sample size.

22

1 11 21 31

2 12 22 32

3 13 23 33

4 14 24 34

5 15 25 35

6 16 26 36

7 17 27 37

8 18 28 38

9 19 29 39

10 20 30 40

List of students in Class A (5 students are needed, taking every 7th position)
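The systematic procedure can be sketched as follows for the class list of 40 positions, with k = 7 and 5 students. The function name and the fixed starting position 3 are assumptions for illustration; in practice the start is chosen at random among the first k positions.

```python
import random

def systematic_sample(units, k, n, start=None, seed=None):
    """Pick a start among the first k positions, then every k-th unit after it."""
    if start is None:
        start = random.Random(seed).randint(1, k)  # 1-based position
    positions = [start + i * k for i in range(n)]
    if positions[-1] > len(units):
        raise ValueError("list too short for this k, n and start")
    return [units[p - 1] for p in positions]

class_a = list(range(1, 41))  # the 40 positions in the class list
picked = systematic_sample(class_a, k=7, n=5, start=3)
print(picked)
```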

23

Sampling methods

3. Stratified sampling

Sampling procedure

1. Every object in the population is arranged into groups (strata) according to a particular attribute (e.g. gender, socio-economic status and income).

2. Select a number of objects from each stratum at random, using

the same percentage for every stratum, or

a different percentage for each stratum.
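A minimal sketch of both variants (same percentage per stratum, or a different percentage per stratum) is given below. The population of 60 women and 40 men, the sampling fractions and the function name are all hypothetical.

```python
import random

def stratified_sample(units, stratum_of, fraction, seed=None):
    """Sample each stratum at random; fraction is one number or a dict per stratum."""
    rng = random.Random(seed)
    strata = {}
    for unit in units:
        strata.setdefault(stratum_of(unit), []).append(unit)
    sample = {}
    for name, members in strata.items():
        f = fraction[name] if isinstance(fraction, dict) else fraction
        sample[name] = rng.sample(members, round(f * len(members)))
    return sample

# Hypothetical population: 60 women and 40 men, stratified by gender.
people = [("F", i) for i in range(60)] + [("M", i) for i in range(40)]
equal = stratified_sample(people, lambda p: p[0], 0.10, seed=1)
unequal = stratified_sample(people, lambda p: p[0], {"F": 0.10, "M": 0.25}, seed=1)
print({k: len(v) for k, v in equal.items()}, {k: len(v) for k, v in unequal.items()})
```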

24

Sampling methods

4. Cluster sampling

Cluster sampling

o closely resembles stratified sampling.

o Clusters from the population are selected at random, then all objects in the selected clusters become the study sample.

Multi-stage sampling is suitable for cases that involve a geographical structure.
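One-stage cluster sampling can be sketched as below: whole clusters are drawn at random and every object inside them is kept. The five schools with four pupils each are a hypothetical population, and the function name is an assumption.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Select clusters at random, then keep every unit in the chosen clusters."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    sample = [unit for name in chosen for unit in clusters[name]]
    return chosen, sample

# Hypothetical population: 5 schools (clusters) with 4 pupils each.
schools = {f"school_{s}": [f"s{s}_pupil_{i}" for i in range(4)] for s in range(1, 6)}
chosen, sample = cluster_sample(schools, 2, seed=0)
print(chosen, len(sample))
```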

25

Sampling methods

5. Quota sampling

Closely resembles stratified sampling, but it is not random.

Commonly used

o in studies that involve interviews.

o when the population size is infinite.

26

Sampling methods

The researcher decides how many objects should represent each group defined by a chosen trait; that number is the quota.

Example:

Gender | Age (years) | Quota
Male | 20 – 29 | 56
Male | 30 – 44 | 104
Female | 20 – 29 | 50
Female | 30 – 44 | 110

27

Sampling methods

6. Snowball sampling

This method is suitable when objects in the population are hard to trace.

Sampling strategy:

1. The researcher needs to find a first object that is suitable for the study.

2. The second and subsequent objects are identified with help from the objects already identified.

3. The objects in the sample are not random.

28

Sampling methods

7. Convenience: objects are selected because they are easy to obtain.

8. Purposive: the researcher selects only objects that are suitable for achieving the study objectives.

9. Self-selection: the sample consists of objects that take part voluntarily.

29

More sampling methods

Line-intersect sampling

Elements are chosen in a region whereby an element is sampled in a chosen line segment.

Panel sampling

A sampling group is chosen (usually at random) and is asked for the same information repeatedly over a period of time.

Event sampling

Behaviour of interest is collected at the specified interval.

30

More sampling methods: Hypothetical data

A set of data that is generated randomly from some known distribution(s).

When can a hypothetical data set be used?

To test performance of a new model/approach under in-control condition.

To help a researcher to identify some possible problems with the proposed model / approach.

31

Hypothetical data: Example

Phase I: construction of control chart

Step 1 Generate 5000 samples of observations, $X_{ij}$, $i = 1, 2, \ldots, p$ and $j = 1, 2, \ldots, n$, from $N_p(0, I_p)$.

Step 2 Compute the robust location and scale estimates for each sample.

Step 3 Randomly generate a new observation, $X_i$, from $N_p(0, I_p)$.

Step 4 Compute the respective $T^2$.

Step 5 Identify the UCL at the 95th (99th) percentile of the 5000 $T^2$ values in Step 4.

Step 6 Generate 1000 samples of observations, $X_{ij}$, $i = 1, 2, \ldots, p$ and $j = 1, 2, \ldots, n$, from the contaminated model.

Step 7 Compute the robust location and scale estimates for each sample.
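Steps 1 to 5 can be sketched as below under strong simplifying assumptions: p is fixed at 2 so that the covariance matrix can be inverted by hand, and the classical sample mean and covariance stand in for the robust estimators of the study, which are not reproduced here. The function names, n = 20 and the seed are illustrative.

```python
import random

def t2_bivariate(x, sample):
    """T^2 distance of point x from the mean of a bivariate sample."""
    n = len(sample)
    m1 = sum(a for a, _ in sample) / n
    m2 = sum(b for _, b in sample) / n
    s11 = sum((a - m1) ** 2 for a, _ in sample) / (n - 1)
    s22 = sum((b - m2) ** 2 for _, b in sample) / (n - 1)
    s12 = sum((a - m1) * (b - m2) for a, b in sample) / (n - 1)
    det = s11 * s22 - s12 ** 2
    d1, d2 = x[0] - m1, x[1] - m2
    # d' S^{-1} d, with the 2x2 inverse written out explicitly
    return (s22 * d1 * d1 - 2 * s12 * d1 * d2 + s11 * d2 * d2) / det

def simulate_ucl(reps=5000, n=20, percentile=0.95, seed=None):
    """Estimate the upper control limit as a percentile of simulated T^2 values."""
    rng = random.Random(seed)

    def draw():
        return (rng.gauss(0, 1), rng.gauss(0, 1))  # one draw from N_2(0, I_2)

    t2_values = []
    for _ in range(reps):
        sample = [draw() for _ in range(n)]               # Step 1
        new_obs = draw()                                  # Step 3
        t2_values.append(t2_bivariate(new_obs, sample))   # Steps 2 and 4
    t2_values.sort()
    return t2_values[max(0, round(percentile * reps) - 1)]  # Step 5

ucl = simulate_ucl(reps=2000, seed=7)
print(round(ucl, 2))
```

Steps 6 and 7 would repeat the same loop with observations drawn from a contaminated model and compare the resulting T² values against this limit.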

32

Checklist….

• Population vs. Sample

• Objects/respondents

• Variables vs. Constant value

• Parameter vs. Estimator

• Randomness

• Types of data:

• Cross-sectional

• Time series

• Functional series

• Spatial data

33

Errors in research activities

Sampling error – caused by sampling design

Selection error

Estimation error

Non-sampling error – caused by factors other than the sampling design, such as mistakes in data collection and processing

Over / under coverage

Processing error

Non-response

Measurement error

34

Measurement

Constant value – an actual value or a specific character whose value does not change.

Variable – a character with values that may vary.

Level of measurement:

Nominal

Ordinal

Interval

Ratio

35

Nominal

Which of the following daily newspapers have you read during the past month?

Newspaper | Read | Not read | Don’t know
The Star | [ ] | [ ] | [ ]
The New Straits Times | [ ] | [ ] | [ ]
Berita Harian | [ ] | [ ] | [ ]

36

Ordinal

One can ask respondents to place things in rank order.

Example: Please number each of the factors listed in order of importance in your choice of a new car.

a. Price ____

b. Fuel economy ____

c. Acceleration ____

d. Safety features ____

Otherwise, one may use the common scales such as Likert, semantic differential scale, Guttman scale and Thurstone scale.

37

Measurement

Data

o Categorical: Nominal, Ordinal

o Quantifiable: Interval, Ratio

Precision increases from nominal to ratio.

38

Exploring your data

It is a good practice to understand your data before any complex analysis is performed.

Objective:

o To identify some strange behaviour.

o To determine a suitable technique that can be applied to the data.

o For validation purposes.

o To interpret the obtained results better.

39

Exploring your data

• Missing value: Objects with no value in some variables.

• Some strategies to handle missing value:

o Exclude objects with missing value.

o Replace a missing value by the mean of all available values for the relevant variable.

o Imputation: missing values can be replaced by some suitable numerical entries.
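The first two strategies can be sketched as follows. Missing entries are represented by `None`, and the scores are hypothetical.

```python
from statistics import mean

def drop_missing(values):
    """Strategy 1: exclude objects with a missing value."""
    return [v for v in values if v is not None]

def impute_mean(values):
    """Strategy 2: replace each missing value by the mean of the available values."""
    fill = mean(drop_missing(values))
    return [fill if v is None else v for v in values]

scores = [4.0, None, 6.0, 8.0, None]  # hypothetical variable with two missing values
print(drop_missing(scores))
print(impute_mean(scores))
```

Mean replacement keeps the sample size but shrinks the variance of the variable, which is one reason other imputation methods are sometimes preferred.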

40

Exploring your data

• Outliers: Values that are distinctly different from other values.

• Outliers may bias the estimated values, which leads to misleading results.

• Strategies to handle outliers:

o Outliers due to recording errors should be corrected.

o If the values are genuine, then some thought must be given as to whether or not they should be retained.

41

Exploring your data

The effect of an outlier in computing the average value.

Row | Sales (RM), with outlier | Sales (RM), corrected
1 | 70.63 | 70.63
2 | 56.28 | 56.28
3 | 70.98 | 70.98
4 | 7.00 | 70.00
5 | 68.42 | 68.42
6 | 56.74 | 56.74
7 | 60.04 | 60.04
Mean | 55.73 | 64.73
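The two averages in the sales table can be checked with a few lines; the median is added here only as a comparison, to show how much less it moves.

```python
from statistics import mean, median

with_outlier = [70.63, 56.28, 70.98, 7.00, 68.42, 56.74, 60.04]
corrected = [70.63, 56.28, 70.98, 70.00, 68.42, 56.74, 60.04]

# One mistyped value (7.00 instead of 70.00) shifts the mean by about RM9.
print(round(mean(with_outlier), 2), round(mean(corrected), 2))
print(median(with_outlier), median(corrected))
```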

42

How to explore your data?

Tabular display

Plot (e.g. histogram, bar chart etc.)

o Better than statistical values, but limited to 2 or 3 variables at one time.

Statistical values

o Common statistical values can be used, such as mean, variance etc.

Map the data

o Can be done using e.g. Principal Component Analysis, Factor Analysis, Multidimensional Scaling etc.

43

Tabular display

Frequency table

Sex | Frequency | Percent | Valid Percent | Cumulative Percent
Female | 87 | 32.2 | 32.2 | 32.2
Male | 183 | 67.8 | 67.8 | 100.0
Total | 270 | 100.0 | 100.0 |

Cross tabulation: Sex by number of repeated exams

Sex | 1 | 2 | 3 | 4 | Total
Female | 59 | 12 | 12 | 4 | 87
Male | 101 | 46 | 21 | 15 | 183
Total | 160 | 58 | 33 | 19 | 270

What information can be extracted from these tables?
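One way to extract information from the cross tabulation is to convert the counts into row percentages, as in this sketch. The function name is an assumption; the counts are those of the table.

```python
counts = {
    "Female": [59, 12, 12, 4],   # number of repeated exams: 1, 2, 3, 4
    "Male": [101, 46, 21, 15],
}

def row_percentages(table):
    """Express each cell as a percentage of its row total."""
    return {
        row: [round(100 * c / sum(cells), 1) for c in cells]
        for row, cells in table.items()
    }

percentages = row_percentages(counts)
print(percentages)
```

Read this way, about 68% of the female students sat the exam only once, against about 55% of the male students.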

44

Tabular display

Bad presentation

45

Pie chart

[Figure: a good and a bad pie chart of Years Experience, with categories 5 or less, 6-10, 11-15, 16-20, 21-35, 36 or more and slice labels 8, 16, 26, 25, 18, 7]

46

Line chart / series plot

For continuous measurements.

Often used to highlight some patterns or behaviour of the target variable.

47

Scatter plot

48

Bar chart

Alternative presentation for table.

For categorical measurement only.

Sometimes can be useful to identify the distribution of the data.

49

Box and Whiskers

Suitable for numerical values.

This plot summarises some important statistics (and features) which include:

o Median

o Quartiles

o Potential outliers

50

Histogram

51

Numerical values

The centre (middle) of the distribution of measurements.

Some measurements:

o Mode

o Median

o Sum

o Arithmetic mean

o Trimmed mean

o Robust mean
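Several of these measures are available in Python's `statistics` module; the trimmed mean is written out by hand. The data set and the 20% trimming proportion are hypothetical.

```python
from statistics import mean, median, multimode

def trimmed_mean(values, proportion):
    """Mean after dropping the given proportion of values from each end."""
    ordered = sorted(values)
    k = int(proportion * len(ordered))
    return mean(ordered[k:len(ordered) - k])

data = [2, 3, 3, 5, 7, 30]  # hypothetical measurements with one large value

print("mode:", multimode(data))
print("median:", median(data))
print("sum:", sum(data))
print("mean:", round(mean(data), 2))
print("trimmed mean:", trimmed_mean(data, 0.2))
```

The large value pulls the arithmetic mean well above the median, while the trimmed mean stays close to it.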

52

Numerical values

Represent how the data are scattered around the centre point, i.e. around the central tendency value.

Some measurements:

o Range

o Percentile

o Quartiles ; interquartile range (IQR)

o Variance

o Standard deviation; coefficient of variation (CV)

o Standard error of mean
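The dispersion measures listed above can be computed as in the sketch below. The data are hypothetical, and `statistics.quantiles` is used with its default ("exclusive") method; other software may use slightly different quartile rules.

```python
from math import sqrt
from statistics import mean, quantiles, stdev, variance

data = [2, 4, 4, 5, 7, 9, 11]  # hypothetical measurements

value_range = max(data) - min(data)
q1, q2, q3 = quantiles(data, n=4)   # quartiles
iqr = q3 - q1                        # interquartile range
var = variance(data)                 # sample variance (divisor n - 1)
sd = stdev(data)                     # sample standard deviation
cv = sd / mean(data)                 # coefficient of variation
se = sd / sqrt(len(data))            # standard error of the mean

print(value_range, iqr, var, round(sd, 3), round(cv, 3), round(se, 3))
```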

53

Weakness of descriptive tools

Descriptive statistics cannot support broader statements about differences and relationships in the data.

They cannot be used to draw conclusions or make predictions about the properties of a population when the information comes from a sample.

54

Statistical inference

Why is inference about a population necessary?

o Sometimes relevant facts are abundant.

o Plots may yield conflicting opinions regarding conclusions among decision makers.

o Humans are incapable of utilising large amounts of data.

So, information contained in a sample is used to make inferences about a population. Common methods are

o estimation.

o statistical hypothesis testing.

55

Statistical inference

Estimation: a process that estimates the value of a parameter of interest. It answers the following question:

• What is the value of the population parameter?

• Example: What is the average salary of Malaysians?

Statistical hypothesis testing: a procedure that tests a hypothesis about the value of a parameter of interest. It answers the following question:

• Is the parameter value equal to this specific value?

• Is it true that Malaysians earn RM2200 monthly?

56

Hypothesis testing

Step 1: Formulate hypotheses.

Step 2: Identify an appropriate test statistic to assess the hypotheses.

Step 3: Compute the test statistic (or the p-value).

Step 4: Compare the test statistic (p-value) to a related distribution value (identified alpha, α).

Step 5: Make decision and conclusion.

57

Hypothesis testing

• Null hypothesis (H0): hypothesis of no effect, e.g. the process change makes no difference.

• Alternative hypothesis (H1): a choice that can be considered if H0 can be ruled out, e.g. the process change has an effect.

58

Hypothesis testing

59

Hypothesis testing: identifying test statistics

• Test statistic: a quantity computed from the sample data.

• Test statistic vs. distribution value (e.g. normal dist., chi-square dist etc.)

• p-value: the probability, computed assuming H0 is true, of obtaining a test statistic at least as extreme as the one observed.

• Also known as the observed level of significance.

• p-value vs. the chosen level of significance, α.

60

Hypothesis testing: decision making

Choose either one:

If the p-value is less than or equal to α, we have enough evidence to reject H0.

If p-value is greater than α, then we do not have enough evidence to reject H0 (but it doesn’t mean that H0 is true).

61

ANALYSIS OF DIFFERENCE

One population comparison

Two populations comparison

Multiple populations comparison

62

One Population Comparison

To test the central value (CT) of a target population. Various hypothesis tests:

Two-tail test: $H_0: CT = CT_0$ versus $H_1: CT \neq CT_0$

One-tail tests: $H_0: CT = CT_0$ versus $H_1: CT > CT_0$, or $H_0: CT = CT_0$ versus $H_1: CT < CT_0$

The value of $CT_0$ (for a mean, µ0) is known. It might be obtained from previous studies, expert opinion etc.

63

One Population Comparison

Parametric methods

o Reliable when the population is normally distributed.

o Strategy:

1. Write a research hypothesis.

2. Choose an appropriate test statistic (either a Z-statistic or a t-statistic) and calculate its value (or p-value) based on the obtained sample.

3. Check for the rejection region. Reject H0 if the p-value is less than the chosen probability of a Type I error, α.

4. Draw conclusions.

64

One Population Comparison

Non-parametric methods

o May be the best choice when the population distribution is highly skewed or heavy-tailed.

o Often, the median is used.

o Example methods: sign test and binomial test.

o Strategy:

1. Identify the value of the population median.

2. Values are ordered from the smallest to the largest.

3. The sample median is calculated.

4. Compare the sample median and the population median.
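A two-sided sign test can be sketched with an exact binomial calculation, as below. The ten passenger counts (in thousands) are hypothetical and are not the data behind the example that follows; the hypothesised median of 270 and the function name are assumptions for illustration.

```python
from math import comb

def sign_test(values, hypothesised_median):
    """Two-sided sign test: counts above/below the median and an exact p-value."""
    plus = sum(v > hypothesised_median for v in values)
    minus = sum(v < hypothesised_median for v in values)
    n = plus + minus                  # values equal to the median are dropped
    k = min(plus, minus)
    # P(X <= k) for X ~ Binomial(n, 0.5), doubled for a two-sided test
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return plus, minus, min(1.0, 2 * tail)

passengers = [229, 229, 240, 251, 258, 262, 265, 268, 301, 372]  # hypothetical
plus, minus, p_value = sign_test(passengers, 270)
print(plus, minus, round(p_value, 4))
```

With 2 values above and 8 below 270, the p-value is about 0.11, so at α = 0.05 there is not enough evidence to reject the hypothesised median.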

65

Example

Suppose that, normally, the average number of passengers on a local flight during school breaks is 270 thousand.

So, we might be interested in checking whether this number (270) still holds in the current situation.

Mode = 229.00

Median = 265.50

Mean = 280.30

66

Example

Parametric test’s result:

Non-parametric test’s result:

67

Two Populations Comparison

Aim: to compare a central value of two different populations. (Need to consider whether both populations have a homogeneous variance.)

Inferences about $\mu_1 - \mu_2$: independent samples, with three different cases:

o Both population distributions are normally distributed with equal variance.

o Both sample sizes are large.

o The sample sizes are small and the population distributions are non-normal.

68

Two Populations Comparison

Two-tail test: $H_0: \mu_1 = \mu_2$ versus $H_1: \mu_1 \neq \mu_2$

One-tail tests: $H_0: \mu_1 = \mu_2$ versus $H_1: \mu_1 > \mu_2$, or $H_0: \mu_1 = \mu_2$ versus $H_1: \mu_1 < \mu_2$

Parametric tests:

o Independent samples t-test with equal variances.

o Independent samples t-test with unequal variances.

Non-parametric tests:

o Mann-Whitney U test

o Wilcoxon Rank Sum test

69

Example

An experiment was conducted to evaluate the effectiveness of a treatment for tapeworm in the stomachs of sheep. A random sample of 24 worm-infected lambs of approximately the same age and health was randomly divided into two groups: drug-treated sheep and untreated sheep.

70

Example: initial data analysis

What is your expected result?

Drug treated

Untreated

71

Parametric test’s result

Non-parametric test’s result:

72

Two Populations Comparison

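Inferences about $\mu_1 - \mu_2$: paired data

Appropriate for studies in which a measurement in one sample is matched or paired with a particular measurement in the other sample.

Hypothesis

Two-tail test: $H_0: \mu_1 - \mu_2 = D_0$ versus $H_1: \mu_1 - \mu_2 \neq D_0$

One-tail tests: $H_0: \mu_1 - \mu_2 = D_0$ versus $H_1: \mu_1 - \mu_2 > D_0$, or $H_0: \mu_1 - \mu_2 = D_0$ versus $H_1: \mu_1 - \mu_2 < D_0$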

73

Example

To compare the wearing qualities of two automobile tires, A and B, a tire of type A and one of type B are randomly assigned and mounted on the rear wheels of each of five automobiles. The automobiles are then operated for a specified number of miles, and the amount of wear is recorded for each tire.

Automobile | Tire A | Tire B
1 | 10.6 | 10.2
2 | 9.8 | 9.4
3 | 12.3 | 11.8
4 | 9.7 | 9.1
5 | 8.8 | 8.3

Mean (A) = 10.24

Mean (B) = 9.76

Std. dev (A) = 1.32

Std. dev (B) = 1.33

74

Example

Independent Samples Test (wear)

Levene's Test for Equality of Variances: F = .003, Sig. = .960

t-test for Equality of Means:

Assumption | t | df | Sig. (2-tailed) | Mean Difference | Std. Error Difference | 95% CI Lower | 95% CI Upper
Equal variances assumed | .574 | 8 | .582 | .4800 | .8362 | -1.4482 | 2.4082
Equal variances not assumed | .574 | 7.999 | .582 | .4800 | .8362 | -1.4483 | 2.4083

Paired Samples Test (Pair 1: wearA - wearB)

Mean | Std. Deviation | Std. Error Mean | 95% CI Lower | 95% CI Upper | t | df | Sig. (2-tailed)
.4800 | .0837 | .0374 | .3761 | .5839 | 12.829 | 4 | .000
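The two t statistics in the output can be reproduced from the tire data with the sketch below. Only the test statistics and degrees of freedom are computed; the p-values need the t distribution, which is not in Python's standard library. The function names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev, variance

tire_a = [10.6, 9.8, 12.3, 9.7, 8.8]
tire_b = [10.2, 9.4, 11.8, 9.1, 8.3]

def pooled_t(x, y):
    """Independent-samples t statistic with equal variances assumed."""
    nx, ny = len(x), len(y)
    sp2 = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    t = (mean(x) - mean(y)) / sqrt(sp2 * (1 / nx + 1 / ny))
    return t, nx + ny - 2

def paired_t(x, y):
    """Paired-samples t statistic, computed on the within-pair differences."""
    d = [a - b for a, b in zip(x, y)]
    t = mean(d) / (stdev(d) / sqrt(len(d)))
    return t, len(d) - 1

t_ind, df_ind = pooled_t(tire_a, tire_b)
t_pair, df_pair = paired_t(tire_a, tire_b)
print(round(t_ind, 3), df_ind)
print(round(t_pair, 3), df_pair)
```

The mean difference is 0.48 in both cases. Pairing removes the car-to-car variation, so the standard error drops from about 0.836 to about 0.037 and the same difference becomes clearly significant.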

75

Multi-Populations Comparison

To check whether k populations share the same central tendency value.

76

Multi-Populations Comparison

A factory produces disc brakes for high-performance automobiles. The following table summarises the mean and standard deviation of the disc brake diameter produced by each of four machines. The target diameter for the brake is 322 mm.

Disc Brake Diameter (mm)

Machine Number | 1 | 2 | 3 | 4
Mean | 321.9985 | 322.0143 | 321.9983 | 321.9954
Std. Deviation | .0111568 | .0106913 | .0104812 | .0069883

77

Multi-Populations Comparison

Total variation

= variation within groups + variation between groups
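The decomposition can be checked numerically with sums of squares, as in this sketch. The three small groups are hypothetical, and the F ratio at the end is the usual one-way ANOVA statistic built from the two components.

```python
from statistics import mean

def variation_components(groups):
    """Return total, within-group and between-group sums of squares."""
    all_values = [v for g in groups for v in g]
    grand = mean(all_values)
    total = sum((v - grand) ** 2 for v in all_values)
    within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    return total, within, between

groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]  # hypothetical measurements, 3 groups
total, within, between = variation_components(groups)

k = len(groups)
n = sum(len(g) for g in groups)
f_ratio = (between / (k - 1)) / (within / (n - k))
print(round(total, 6), round(within, 6), round(between, 6), round(f_ratio, 6))
```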

78

Multi-Populations Comparison

Hypothesis testing:

Parametric test

o One-way ANOVA

Nonparametric test

o Kruskal-Wallis H

o Median test

$H_0: \mu_1 = \mu_2 = \cdots = \mu_k$

$H_1$: at least two populations are different

79

Parametric test’s result:

Nonparametric test’s result:

80

Think!!

Job satisfaction was investigated in two different factories A and B. In factory A the employees are on a fixed shift system while in factory B the workers have a rotating shift system. In factory A, a worker always works the same shift, while in factory B, a worker rotates through the three shifts. A satisfaction score was collected from each employee and the aim is to identify difference in job satisfaction between the two groups of workers.

Q: What information is needed in order to determine the choice of test?

81

MEASUREMENT ADEQUACY

o Validity

• Does the instrument measure what it is supposed to?

o Reliability

• Does the instrument consistently measure what it is supposed to?

o Sensitivity

• How good is the instrument at detecting the smallest amount that it can measure?

82

Validity

• In general, there are two types: internal and external validity.

• Internal validity refers to the rigor with which the study was performed.

• Design of the study

• Measurements chosen

• Factors involved especially in a study of causal relationships

• External validity refers to the extent to which the results of a study are generalisable or transferable (authenticity).

83

Internal validity

Face validity

Content validity

Criterion-related validity

Predictive validity occurs when the criterion measures are obtained at a time after the test e.g. career tests.

Concurrent validity occurs when the criterion measures are obtained at the same time as the test scores e.g. level of depression.

Construct validity

Convergent

Discriminant

84

1. Face validity

It is the basic and minimal index of validity.

It is concerned with how a measure or procedure appears to the respondents and whether they can understand it.

Does it seem well designed?

Does it seem as though it will work reliably?

Testing strategy: a questionnaire is given to a sample of respondents to judge their reaction to the items.

85

2. Criterion-Related Validity

Also known as instrumental validity.

It demonstrates the accuracy of a measure or procedure by comparing it with another measure or procedure which has been demonstrated to be valid.

Example: suppose we have a hands-on driving test that has been shown to be an accurate test of driving skills, and someone proposes a new written driving test. The written test can then be validated using a criterion-related strategy, in which the written test is compared to the hands-on driving test.

86

2. Criterion-Related Validity

Predictive validity

Indicates the ability of the measuring instrument to differentiate among individuals on a future criterion.

Example: employee ability test

Concurrent validity

Indicates the ability of the measuring instrument to differentiate among individuals who are known to be different (they should score differently on the instrument).

Example: work ethic among welfare recipients.

87

3. Construct validity

Construct validity testifies to the agreement between a theoretical concept and a specific measuring device or procedure.

Example: A doctor would like to test the effectiveness of painkillers on chronic back pain sufferers. Every day, he asks the test subjects to rate their pain level on a scale of one to ten. In this case, construct validity would test whether the doctor was actually measuring pain and not numbness, discomfort, anxiety or any other factor.

88

3. Construct validity

Convergent validity is the actual general agreement among ratings, gathered independently of one another, where measures should be theoretically related.

The scores obtained by two different instruments measuring the same concept are highly correlated.

Discriminant validity is the lack of a relationship among measures which theoretically should not be related.

Two variables are predicted to be uncorrelated and the scores obtained by measuring them are indeed empirically found to be so.

89

3. Construct validity

Strategy to achieve construct validity:

Literature review

Confirmatory factor analysis

Correlation analysis

Some multivariate analyses

90

4. Content validity

Content validity ensures measures include an adequate and representative set of items that tap the concept.

Example:

1. A researcher needing to measure an attitude like self-esteem must decide what constitutes a relevant domain of content for that attitude.

2. In socio-cultural studies, content validity forces the researchers to define the very domains they are attempting to study.

91

4. Content validity

Strategy to achieve content validity:

Existing literature

Qualitative research

Judgment of panel of experts

92

Reliability

Reliability is defined as the extent to which an instrument consistently measures what it is supposed to.

Classical test theory – reliability is the ratio of the variation in the true score to the variation in the observed score.

The true-score model

93

Approaches to estimate reliability

1. Equivalency reliability

2. Stability

o Test-retest reliability

o Parallel-form reliability

3. Internal consistency

o Inter-item consistency reliability

o Split-half reliability

4. Inter-rater reliability

94

1. Equivalency reliability

o Equivalency reliability is the extent to which two items measure identical concepts at an identical level of difficulty.

o Equivalency reliability is determined by relating two sets of test scores to one another to highlight the degree of relationship or association.

95

2. Stability

A set of measures is considered stable if it gives consistent results over time, despite uncontrollable conditions or the state of the respondents themselves.

Example: The method of maintaining weights used by the U.S. Bureau of Standards. Platinum objects of fixed weight (one kilogram, one pound, etc...) are kept locked away. Once a year they are taken out and weighed, allowing scales to be reset so they are "weighing" accurately. Keeping track of how much the scales are off from year to year establishes a stability reliability for these instruments. In this instance, the platinum weights themselves are assumed to have a perfectly fixed stability reliability.

96

2. Stability

Test-retest reliability is the correlation between two successive measurements with the same test.

Example:

you can give your test in the morning to your pilot sample and then again in the afternoon. The two sets of data should be highly correlated if the test is reliable.

97

2. Stability

Parallel-form reliability is the successive administration of two parallel forms of the same test.

Examples:

There are two versions that measure Verbal and Math skills in SAT. Two forms for measuring Math should be highly correlated and that would document reliability.

In an exam, two groups of students are given questions with similar items and the same response format, differing only in the wording and the ordering of the questions.

98

3. Internal consistency

It indicates the homogeneity of the items in the measure that tap the construct.

Example: a questionnaire was designed to find out about college students' dissatisfaction with a particular textbook. The researcher then needs to analyse the internal consistency of the survey items dealing with dissatisfaction, which reveals the extent to which the items on the questionnaire focus on the notion of dissatisfaction.

99

3. Internal consistency

Inter-item consistency reliability tests the consistency of respondents’ answers to all items in a measure.

In other words, it ensures that the items are homogeneous or all measuring the same construct.

Statistical procedures like KR-20 (Kuder-Richardson) or Cronbach's Alpha are commonly used for these purposes.
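Cronbach's alpha can be sketched from its usual formula, alpha = k/(k - 1) × (1 - sum of item variances / variance of the total score), where k is the number of items. The scores of five respondents on three items are hypothetical, and the function name is an assumption.

```python
from statistics import variance

def cronbach_alpha(items):
    """items: one list of scores per item, with respondents in the same order."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]   # total score per respondent
    item_variance = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance / variance(totals))

# Hypothetical answers of 5 respondents to 3 items on a 1-5 scale.
item_1 = [2, 3, 4, 4, 5]
item_2 = [3, 3, 4, 5, 5]
item_3 = [2, 4, 4, 5, 5]

alpha = cronbach_alpha([item_1, item_2, item_3])
print(round(alpha, 3))
```

Values close to 1 indicate that the items move together, which is what inter-item consistency asks for.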

100

3. Internal consistency

Split-half reflects the correlation between two halves of an instrument.

Example: you have the SAT Math test and divide the items on it in two parts. If you correlated the first half of the items with the second half of the items, they should be highly correlated if they are reliable.

101

4. Inter-rater reliability

Inter-rater reliability reflects the consistency of the judgment of several raters on how they interpret the responses. In other words, it is the extent to which two or more individuals (coders or raters) agree.

Scenario: Two or more researchers are observing a high school classroom. The class is discussing a movie that they have just viewed as a group. The researchers have a sliding rating scale (1 being most positive, 5 being most negative) with which they are rating the students' oral responses. Inter-rater reliability assesses the consistency of how the rating system is implemented.

102

Power of a statistical test

It is the probability of rejecting the null hypothesis when the null hypothesis is false.

Power also represents the sensitivity of the undertaken analysis.

Factors influencing power: (i) the statistical significance criterion (alpha value), (ii) the magnitude of the effect under the alternative hypothesis (effect size) and (iii) the sample size.
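How the three factors act on power can be illustrated for the simplest case, a one-sided z-test of a mean with known standard deviation. This choice of test is an assumption made for the sketch; the effect size is the shift in the mean expressed in standard deviation units.

```python
from math import sqrt
from statistics import NormalDist

def power_one_sided_z(effect_size, n, alpha=0.05):
    """Power of a one-sided z-test: P(reject H0) when the true shift is effect_size."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha)                    # rejection threshold under H0
    return 1 - z.cdf(z_crit - effect_size * sqrt(n))

print(round(power_one_sided_z(0.5, 25), 3))               # baseline
print(round(power_one_sided_z(0.5, 50), 3))               # larger sample
print(round(power_one_sided_z(0.8, 25), 3))               # larger effect
print(round(power_one_sided_z(0.5, 25, alpha=0.01), 3))   # stricter alpha
```

Power rises with the sample size and the effect size, and falls when α is made stricter.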

103

Complex analysis

Choosing the right statistical tool

[Figure: decision chart based on the number of variables (1 variable, 2 variables, more than 2 variables) and on whether the sample is homogeneous]

104

Bivariate analysis

BIVARIATE studies two variables simultaneously.

Common studies

• Correlation – measuring relationship between two continuous variables.

• Cross tabulation - measuring relationship between two categorical (or binary) variables.

• Simple modelling – finding the curve (e.g. a straight line) that best explains how one variable (the independent variable) influences the other variable (the dependent variable).

105

MULTIVARIATE ANALYSIS

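Multivariate data arise when more than one variable is measured on each object.

Data arrangement (n objects in rows, p variables in columns)

$$\begin{pmatrix} x_{11} & x_{12} & \cdots & x_{1p} \\ x_{21} & x_{22} & \cdots & x_{2p} \\ \vdots & \vdots & & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{np} \end{pmatrix}$$

Type of studies:

o Descriptive multivariate studies

o Inferential studies

o Modelling and prediction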

106

Multivariate analysis

Interdependence methods

Involve only either independent variables or dependent variables.

Aim: to seek patterns or any hidden information.

Methods: principal component analysis, factor analysis, multidimensional scaling, cluster analysis, projection pursuit etc.

Dependence methods

Both independent variable(s) and dependent variable(s) are measured.

Methods: multiple regression, discriminant analysis, MANOVA, canonical analysis, SEM etc.

107

~: The End :~