Define the mean.
The mean is the value which is derived by summing all the values and dividing the sum by the number of observations.
What are the advantages of using mode?
Mode is the value at which the maximum occurrence or frequency is found. It is a measure that is less widely used than the mean and the median.
What is dispersion?
The term dispersion refers to the scattering of scores about the measures of central tendency. It is used to measure the extent to which individual items or numerical data tend to vary or spread about an average value. Thus the dispersion is the degree of spread or scatter or variations of measures about a central value.
Define Correlation.
Correlation refers to the nature and strength of correspondence or relationship between two variables. The terms nature and strength in the definition refer to the direction and the degree with which the variables vary together.
What is perfect correlation?
The maximum degree of correspondence or relationship goes up to 1 (one) in mathematical terms. Once the direction of correlation is taken into account, the coefficient ranges from -1 to +1 through zero. Its magnitude can never be more than one. A correlation of 1 is known as perfect correlation (whether positive or negative). Between the two points of divergent perfect correlation lies 0 (zero) correlation, a point of no correlation, i.e. the absence of any correlation between the variables.
What is the maximum extent of correlation?
The maximum extent of correlation is 1 (one) in mathematical terms. It can never be more than one.
Explain relative position of mean, median and mode in a normal distribution and skewed distribution with the help of diagram.
The three measures of central tendency, mean, median and mode, can easily be compared with the help of the normal distribution curve given below:
Fig. 2.4 : Normal Distribution Curve
The normal distribution has an important characteristic. The mean, median and mode are the same score because a normal distribution is symmetrical. The score with the highest frequency occurs in the middle of the distribution, and exactly half of the scores occur above the middle and half below it. Most of the scores occur around the middle of the distribution, i.e. near the mean. Very high and very low scores do not occur frequently and are, therefore, considered rare.
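If the data are skewed or distorted in some way, the mean, median and mode will not coincide, and the effect of the skew needs to be considered, as in the following figures. In a positively skewed distribution the mean is pulled towards the high scores and is generally the largest of the three (mode < median < mean), while in a negatively skewed distribution the order is reversed (mean < median < mode).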
Fig. 2.5 : Positive Skew
Fig. 2.6 : Negative Skew
Comment on the applicability of mean, median and mode.
Mean:
1. Simplicity : It is the simplest of all the measures of central tendency.
2. Representative Value : It is based on all the items in a series and is, therefore, a representative value of the different items.
3. Certainty : Arithmetic mean is a definite value. It has no scope for estimated values.
4. Stability : Arithmetic mean is a stable measure of central tendency.
5. Basis of comparison : It can be easily used for comparison because it is stable and certain.
Median :
1. Median is definite.
2. It is easy to calculate and understand median.
3. It can also be determined graphically.
4. It is not affected by extreme values.
5. It can be calculated in the absence of any one of the items.
6. It is helpful in qualitative facts such as ability, stability etc.
7. It is useful in measuring dispersion.
Mode :
1. It is simple, precise and easy to understand.
2. It is not affected by extreme items because it is not based on every item of the series.
3. It cannot be given further mathematical treatment.
4. It can be located on graph.
Explain the process of computing Standard Deviation with the help of an imaginary example.
Standard deviation is the square root of the arithmetic mean of the squares of the deviations of the items from their mean value. It is a precise measure of dispersion and is denoted by the Greek letter σ (small sigma).
Computation of Standard Deviation :
The following formula is used to calculate the standard deviation for ungrouped data :
$$\sigma = \sqrt{\frac{\sum d^2}{N}}$$
Where σ = Standard deviation (S.D.)
Σd² = Sum total of squares of deviations of the items from their mean
N = Number of items
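The above formula becomes rather tedious if the values of X involve decimal points and also if the number of observations is very large. We may then use the following short-cut method, in which the deviations d are taken from an assumed mean instead of the actual mean :
$$\sigma = \sqrt{\frac{\sum d^2}{N} - \left(\frac{\sum d}{N}\right)^2}$$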
Example : The following table shows the rainfall figures of the last ten years. Calculate the standard deviation.
Year | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
Rainfall (in cm.) | 100 | 90 | 120 | 110 | 80 | 70 | 150 | 130 | 50 | 100 |
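The worked solution to this example is not reproduced in these notes. As a minimal sketch, assuming the definition given above (square root of the mean of the squared deviations from the arithmetic mean, with divisor N), the computation could be checked in Python as follows. The variable names are illustrative only.

```python
import math

# Rainfall (in cm) for the ten years listed in the example.
rainfall = [100, 90, 120, 110, 80, 70, 150, 130, 50, 100]

n = len(rainfall)
mean = sum(rainfall) / n                     # 1000 / 10 = 100.0
deviations = [x - mean for x in rainfall]    # d = X - mean
sum_sq = sum(d ** 2 for d in deviations)     # sum of squared deviations
sigma = math.sqrt(sum_sq / n)                # divisor N, as in the definition

print(mean, sum_sq, round(sigma, 2))         # 100.0 7800.0 27.93
```

Under these assumptions the standard deviation of the rainfall series comes to about 27.93 cm.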
Which measure of dispersion is the most unstable statistic and why?
Range is the most unstable measure of dispersion because it depends only on the highest and the lowest scores. This can be observed from the two data sets given below :
A
Scores of Individuals |
Individual | Score |
X1 | 52 |
X2 | 55 |
X3 | 50 |
X4 | 48 |
X5 | 45 |
B
Scores of Individuals |
Individual | Score |
X1 | 28 |
X2 | 0 |
X3 | 98 |
X4 | 55 |
X5 | 69 |
The mean derived from both data sets is the same, i.e. 50. The highest and the lowest scores in table A are 55 and 45 respectively, whereas the distribution in table B has a high score of 98 and a low score of zero. The range of the first distribution is therefore 10, whereas it is 98 in the second distribution. Although the mean for both the groups is the same, the first group is obviously stable or homogeneous as compared to the second group, which is highly unstable or heterogeneous.
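The comparison of the two groups can be reproduced with a few lines of Python. This is only a sketch: the helper names are hypothetical, and the second score of group B is taken as zero.

```python
# Scores of the five individuals in groups A and B.
scores_a = [52, 55, 50, 48, 45]
scores_b = [28, 0, 98, 55, 69]

def mean(values):
    return sum(values) / len(values)

def value_range(values):
    # The range uses only the two extreme scores of a series.
    return max(values) - min(values)

for name, scores in (("A", scores_a), ("B", scores_b)):
    print(name, mean(scores), value_range(scores))
# A 50.0 10
# B 50.0 98
```

Both groups share the same mean, yet a single extreme score is enough to change the range completely, which is why the range is regarded as unstable.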
Write a detailed note on the degree of correlation.
When reference has been made to the direction of correlation, negative or positive, a natural curiosity arises to know the degree of correspondence or association of the two variables. The maximum degree of correspondence or relationship goes up to 1 (one) in mathematical terms. It can never be more than one. A correlation of 1 is known as perfect correlation (whether positive or negative). Between the two points of divergent perfect correlation lies 0 (zero) correlation, a point of no correlation, i.e. the absence of any correlation between the variables.
What are various steps for the calculation of rank order correlation?
The steps for the calculation of rank order correlation are as under :
(i) Copy the data related to the X-Y variables given in the exercise and put them in the first and second columns of the table.
(ii) Both the variables are to be ranked separately. The ranks of the X-variable are to be recorded in the third column headed by XR (Rank of X). Similarly the ranks of the Y-variable (YR) are to be recorded in the fourth column.
(iii) Now since both XR and YR have been obtained, find the difference (D) between the two sets of ranks and record it in the fifth column.
(iv) Each of these differences is squared and placed in the sixth column, and the sum of this column of squares (ΣD²) is obtained.
(v) Then the computation of the rank correlation is done by the application of the following equation :
$$\rho = 1 - \frac{6 \sum D^2}{N(N^2 - 1)}$$
Where ρ = Rank correlation
ΣD² = Sum of the squares of the differences between the two sets of ranks.
N = The number of pairs of X-Y.
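The steps above can be sketched in Python. The X-Y values below are hypothetical and chosen only for illustration, the function names are not from the original, and the sketch assumes the usual rank-difference formula, 1 − 6ΣD² / (N(N² − 1)), with no tied values in either variable.

```python
def ranks(values):
    # Rank 1 goes to the largest value; assumes there are no tied values.
    ordered = sorted(values, reverse=True)
    return [ordered.index(v) + 1 for v in values]

def rank_correlation(x, y):
    xr, yr = ranks(x), ranks(y)                       # step (ii)
    diffs = [a - b for a, b in zip(xr, yr)]           # step (iii)
    sum_d2 = sum(d ** 2 for d in diffs)               # step (iv)
    n = len(x)
    return 1 - (6 * sum_d2) / (n * (n ** 2 - 1))      # step (v)

# Hypothetical X-Y pairs, for illustration only.
x = [10, 20, 30, 40, 50]
y = [12, 25, 20, 48, 60]

print(ranks(x), ranks(y))                  # [5, 4, 3, 2, 1] [5, 3, 4, 2, 1]
print(round(rank_correlation(x, y), 2))    # 0.9
```

With tied values the ranks would have to be averaged, which this sketch deliberately leaves out.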
Take an imaginary example applicable to geographical analysis and explain direct and indirect methods of calculating mean from ungrouped data.
Direct method : The following table gives the rainfall figures of a place. Calculate the mean by the direct method :
Rainfall (in mm) | 30-35 | 35-40 | 40-45 | 45-50 | 50-55 | 55-60 | 60-65 | 65-70 | 70-75 |
No. of days | 5 | 6 | 11 | 18 | 19 | 15 | 13 | 1 | 2 |
(i) Direct Method
Class (Rainfall) (in mm) | No. of days (Frequency) (f) | Mid-point m | fm |
30-35 | 5 | 32.5 | 162.5 |
35-40 | 6 | 37.5 | 225.0 |
40-45 | 11 | 42.5 | 467.5 |
45-50 | 18 | 47.5 | 855.0 |
50-55 | 19 | 52.5 | 997.5 |
55-60 | 15 | 57.5 | 862.5 |
60-65 | 13 | 62.5 | 812.5 |
65-70 | 1 | 67.5 | 67.5 |
70-75 | 2 | 72.5 | 145.0 |
Total | n = Σf = 90 | | Σfm = 4595.0 |
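The final step of the direct method, dividing Σfm by Σf, is not shown in these notes. A minimal Python sketch of the whole calculation, using the class limits and frequencies from the table and illustrative variable names, might look like this:

```python
# (lower limit, upper limit, number of days) for each rainfall class.
classes = [
    (30, 35, 5), (35, 40, 6), (40, 45, 11), (45, 50, 18), (50, 55, 19),
    (55, 60, 15), (60, 65, 13), (65, 70, 1), (70, 75, 2),
]

n = sum(f for _, _, f in classes)                          # total frequency
sum_fm = sum(f * (lo + hi) / 2 for lo, hi, f in classes)   # f x mid-point
mean = sum_fm / n

print(n, sum_fm, round(mean, 2))   # 90 4595.0 51.06
```

On these figures the mean rainfall works out to roughly 51.06 mm.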
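(ii) Indirect Method : The following formula is used in computing the mean by the indirect method :
$$\bar{X} = A + \frac{\sum d}{N}$$
Where :
A = Subtracted constant (assumed mean), Σd = Sum of the coded scores.
N = Number of individual observations in a series.
For grouped data, as in the example below, the sum of the coded scores is obtained as Σfdx.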
Example : Assumed Mean = 50
Rainfall | Mid-values (X) | dx = X–A | No. of days (f) | fdx |
30–35 | 32.5 | 32.5–50 = –17.5 | 5 | 5×–17.5 = –87.5 |
35–40 | 37.5 | 37.5–50 = –12.5 | 6 | 6×–12.5 = –75 |
40–45 | 42.5 | 42.5–50 = –7.5 | 11 | 11×–7.5 = –82.5 |
45–50 | 47.5 | 47.5–50 = –2.5 | 18 | 18×–2.5 = –45.0 |
50–55 | 52.5 | 52.5–50 = +2.5 | 19 | 19×+2.5 = 47.5 |
55–60 | 57.5 | 57.5–50 = +7.5 | 15 | 15×+7.5 = 112.5 |
60–65 | 62.5 | 62.5–50 = +12.5 | 13 | 13×+12.5 = 162.5 |
65–70 | 67.5 | 67.5–50 = +17.5 | 1 | 1×+17.5 = 17.5 |
70–75 | 72.5 | 72.5–50 = +22.5 | 2 | 2×+22.5 = 45.0 |
Total | | | N = 90 | Σfdx = 95 |
Mean = A + Σfdx / N = 50 + 95 / 90 = 51.06 mm (approximately), which agrees with the result of the direct method.
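The indirect (assumed mean) calculation can be checked in the same way. The sketch below assumes the mean is obtained as the assumed mean plus Σfdx divided by N; the variable names are illustrative.

```python
assumed_mean = 50

# (mid-value, number of days) for each rainfall class.
data = [(32.5, 5), (37.5, 6), (42.5, 11), (47.5, 18), (52.5, 19),
        (57.5, 15), (62.5, 13), (67.5, 1), (72.5, 2)]

n = sum(f for _, f in data)
# Deviation of each mid-value from the assumed mean, weighted by frequency.
sum_fdx = sum(f * (x - assumed_mean) for x, f in data)
mean = assumed_mean + sum_fdx / n

print(n, sum_fdx, round(mean, 2))   # 90 95.0 51.06
```

The result matches the mean obtained by summing f × mid-point directly, as it should, since subtracting and re-adding a constant does not change the average.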
Draw scatter plots showing different types of perfect correlations.
Following are the values of X and Y variables:
X | 4 | 6 | 8 | 10 | 12 | 14 | 16 |
Y | 8 | 10 | 12 | 14 | 16 | 18 | 20 |
Solution :
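The values show that there is perfect positive correlation between the X and Y variables : every increase of 2 units in X is accompanied by an increase of 2 units in Y, so the plotted points fall exactly on a straight line rising from left to right.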
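Since the scatter plots themselves are not reproduced here, a numerical check may help. The sketch below uses the product-moment coefficient, a technique these notes do not describe, and the reversed Y series is a hypothetical addition made only to illustrate a perfect negative correlation.

```python
import math

x = [4, 6, 8, 10, 12, 14, 16]
y = [8, 10, 12, 14, 16, 18, 20]

def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    return sxy / math.sqrt(sxx * syy)

r_positive = pearson_r(x, y)
# Hypothetical series: the same Y values in reverse order fall as X rises.
r_negative = pearson_r(x, list(reversed(y)))

print(round(r_positive, 6), round(r_negative, 6))   # 1.0 -1.0
```

A coefficient of +1 corresponds to points lying on a rising straight line, and −1 to points lying on a falling one.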
What is Median?
Median is that value which divides a series into two equal parts, one part comprising all values greater and the other all values less than the median. It is the middle value of the series when arranged in ascending or descending order. In symbols,
$$M = \text{value of the } \left(\frac{N+1}{2}\right)^{\text{th}} \text{ item}$$
M = Median
N = Number of items in the series.
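As an illustration, the median can be located in Python by sorting the series and picking the middle item. The sketch assumes the common convention of averaging the two middle items when the number of items is even. The odd series reuses the seven values from the standard deviation exercise elsewhere in these notes, and the even series is a hypothetical one made from the first six of them.

```python
def median(values):
    ordered = sorted(values)          # arrange in ascending order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]           # the ((N + 1) / 2)th item
    # Even N: assumed convention, average of the two middle items.
    return (ordered[mid - 1] + ordered[mid]) / 2

odd_series = [15, 18, 20, 12, 10, 9, 11]
even_series = [15, 18, 20, 12, 10, 9]

print(median(odd_series), median(even_series))   # 12 13.5
```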
What is meant by Mode?
The value of the variable which occurs most frequently is called Mode. In other words, it is the value which has the highest frequency in the series. For example, the daily wages of 10 workers in a factory are Rs. 20, 21, 23, 23, 23, 23, 25, 26, 26. Here the mode or modal value is Rs. 23.
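The mode of the wage series can be confirmed by counting frequencies. The sketch below uses only the nine wage figures actually listed in the example; the variable names are illustrative.

```python
from collections import Counter

# Daily wages (Rs.) exactly as listed in the example.
wages = [20, 21, 23, 23, 23, 23, 25, 26, 26]

counts = Counter(wages)
mode_value, frequency = counts.most_common(1)[0]

print(mode_value, frequency)   # 23 4
```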
What do you mean by variation or dispersion?
Dispersion: It is known as the ‘second degree average of series.’ It is the scatter of the various items of the series. It is a measure of the variation of items from their central tendency.
What is coefficient of variation?
Coefficient of Variation : This concept is used to make comparison of dispersion or variation between two or more series. It is a relative measure of variation. In the words of Karl Pearson, “Coefficient of variation is the percentage variation in the mean, the standard deviation being treated as the total variation in the mean”. In fact, coefficient of variation is the percentage expression of standard deviation.
$$\text{Coefficient of Variation} = \frac{\sigma}{\bar{X}} \times 100$$
Where σ = standard deviation and X̄ = Arithmetic mean.
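To show how the coefficient of variation compares two series, the sketch below applies it to the two groups of five scores used earlier in these notes (both with mean 50). It assumes the coefficient is the standard deviation expressed as a percentage of the mean and relies on Python's standard `statistics` module; the function name is illustrative.

```python
import statistics

def coefficient_of_variation(values):
    # Standard deviation (divisor N) as a percentage of the arithmetic mean.
    return statistics.pstdev(values) / statistics.fmean(values) * 100

scores_a = [52, 55, 50, 48, 45]
scores_b = [28, 0, 98, 55, 69]

cv_a = coefficient_of_variation(scores_a)
cv_b = coefficient_of_variation(scores_b)

print(round(cv_a, 1), round(cv_b, 1))   # roughly 6.8 and 67.4
```

The series with the smaller coefficient, group A here, is the more homogeneous of the two.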
What are the differences between Mean and Mode?
Mean | Mode |
1. Arithmetic mean is the average of all items in a series. It is obtained by adding together all the items and dividing the total by the number of items. | 1. It is the value of the variable which occurs most frequently. It is the value which has the highest frequency in the series. |
2. It can be easily calculated and is simple to understand. | 2. It is simple, precise and easy to understand. |
3. It is affected by the presence of every item in the group and hence is based on all the observations. It may be too much affected by extreme items. | 3. It is not affected by extreme items because it is not based on every item of the series. |
4. It can be used for further mathematical exercises. | 4. It cannot be given further mathematical treatment. |
5. It cannot be located on graph. | 5. It can be located on graph. |
How many types of central tendencies are there?
Central Tendencies : According to Crum and Smith, “An average is sometimes called a measure of central tendency, because individual values of variables cluster around it.”
Types of Central tendencies or types of averages : Averages are broadly classified into two categories : (i) Mathematical Averages and (ii) Positional Averages.
Define the term mode. What are its merits and demerits?
Mode is a measure of central tendency of a statistical series. Mode is the most frequently occurring value in a series. It is the typical value around which the items tend to cluster. It is the representative value of a series around which there is maximum concentration.
Merits of Mode : The following are the merits of mode :
(i) Simple and popular : Mode is a very simple measure of central tendency, and a glance at the series is enough to locate the modal value.
(ii) Less effects of extreme marginal values : It is less affected by extreme and marginal values as compared to mean values.
(iii) Best representation: Mode is the best representation of the series because it occurs most frequently in the series.
(iv) Knowledge of all frequencies not essential: It is sufficient to know the item with highest frequency in the distribution.
(v) Practical utility : Mode is practically useful.
(vi) Graphic determination : Mode can be determined graphically also. This makes it very simple and easy to understand.
Demerits of Mode :
(i) Uncertain and unclear : It is an uncertain and unclear measure of central tendency. The addition of an item with the highest frequency in the series changes the entire complexion of the series.
(ii) Not based on all observations : It is not based on all observations because it represents the item of highest frequency only.
(iii) Misleading : Because it is not based on all the observations of the series.
(iv) Ignores extreme values : Mode does not take into account the extreme values and is not suitable for a series where extreme values are also to be given importance.
(v) Affected by magnitude of class interval : It is affected by magnitude of class interval and changes with the change of magnitude of class interval.
What is median? Give its merits and demerits.
Median : It is that value of the variable which divides the group into two equal parts, one part comprising all values greater and the other all values less than the median. In symbols,
$$M = \text{value of the } \left(\frac{N+1}{2}\right)^{\text{th}} \text{ item}$$
M = Median.
N = Number of items.
Merits
1. Median is definite.
2. It is easy to calculate and understand median.
3. It can also be determined graphically.
4. It is not affected by extreme values.
5. It can be calculated in the absence of any one of the items.
6. It is helpful in qualitative facts such as ability, stability, etc.
7. It is useful in measuring dispersion.
Demerits
1. In it all the items of a series are not given equal importance.
2. If the number of items is even, the exact value of the median cannot be located; it can only be estimated as the average of the two middle items.
3. It is affected by fluctuations of sampling.
4. It cannot be given further algebraic treatment.
5. Data needs to be arranged in ascending or descending order.
What is Standard Deviation? How does it differ from Mean Deviation? What are its advantages and disadvantages?
Standard Deviation : Karl Pearson introduced the concept of Standard Deviation in 1893. It is the most popular measure of dispersion since it does not suffer from the defects and limitations which other measures of deviation have. It can be defined as the square root of the mean of the squared deviations taken from the arithmetic mean. It is also called the root mean square deviation. The Greek letter ‘σ’ (read as sigma) is used to denote the standard deviation.
Features of Standard Deviation. Some features regarding standard deviation should be noted. They are : (i) Greater the amount of standard deviation, greater shall be dispersion or variability. In other words, smaller standard deviation means more homogeneity of data and vice-versa. (ii) If two distributions have the same mean, the one with the smaller standard deviation has a more representative mean.
Advantages of Standard Deviation :
(1) Based on all values : The calculation of Standard Deviation is based on all the values of a series. It does not ignore any value.
(2) Certain Measure : Standard Deviation is a clear and certain measure of dispersion. Therefore, it can be used in all situations.
(3) Little Effect of Sampling : Change in sampling causes little effect on Standard Deviation. This is because deviation is based on all the values of a sample.
(4) Algebraic Treatment : Standard Deviation is capable of further algebraic treatment.
Disadvantages of Standard Deviation :
(1) Difficult : Standard Deviation is difficult to calculate or understand.
(2) More importance to Extreme Value : In the calculation of standard deviation extreme values get greater importance.
Distinction between Mean Deviation and Standard Deviation :
Both of them are based on all the items of a distribution, but they are different from each other in the following ways :
(i) In calculating the mean deviation algebraic signs are ignored. But they are considered in calculating the standard deviation.
(ii) Median or mean is used in calculating the mean deviation. But only mean is used to calculate the standard deviation.
With the help of data given calculate Standard Deviation.
15, 18, 20, 12, 10, 9, 11.
Calculation of Standard Deviation:
Values X | d = X – 12 | d² |
15 | 3 | 9 |
18 | 6 | 36 |
20 | 8 | 64 |
12 | 0 | 0 |
10 | -2 | 4 |
9 | -3 | 9 |
11 | -1 | 1 |
N = 7 | Σd = 11 | Σd² = 123 |
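Apply the formula for Standard Deviation, with the deviations d taken from the assumed mean 12 :
$$\sigma = \sqrt{\frac{\sum d^2}{N} - \left(\frac{\sum d}{N}\right)^2}$$
After putting the values
$$\sigma = \sqrt{\frac{123}{7} - \left(\frac{11}{7}\right)^2} = \sqrt{17.57 - 2.47} = \sqrt{15.10} \approx 3.89$$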
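The arithmetic can be cross-checked in Python. The sketch assumes the short-cut formula √(Σd²/N − (Σd/N)²) with deviations taken from an assumed mean of 12 (the value implied by the tabulated deviations) and compares it with `statistics.pstdev`, which works from the actual mean.

```python
import math
import statistics

values = [15, 18, 20, 12, 10, 9, 11]
assumed_mean = 12   # assumption: inferred from the deviations in the table

d = [x - assumed_mean for x in values]
n = len(values)
sum_d = sum(d)                       # sum of deviations
sum_d2 = sum(v ** 2 for v in d)      # sum of squared deviations

sigma_shortcut = math.sqrt(sum_d2 / n - (sum_d / n) ** 2)
sigma_direct = statistics.pstdev(values)   # deviations from the actual mean

print(sum_d, sum_d2, round(sigma_shortcut, 2), round(sigma_direct, 2))
# 11 123 3.89 3.89
```

Both routes give the same value, which illustrates that the choice of assumed mean does not affect the standard deviation.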
The following table gives the height and weight of 10 students in a class. Draw a scatter diagram and interpret whether the correlation is positive or negative.
Height (cms.) | 180 | 150 | 158 | 165 | 175 | 163 | 145 | 195 | 180 | 155 |
Weight (kgs.) | 65 | 54 | 55 | 61 | 60 | 54 | 50 | 63 | 65 | 50 |
Scatter diagram : Let us denote the height as X and weight as Y to plot the two variables.
Fig. 2.8:
Interpretation : The two variables have a high degree of positive correlation since the dots cluster around a line that moves upwards from the lower left-hand corner to the upper right-hand corner.
Advantages of Scatter Diagram : (i) It is a simple method which can be followed with ease.
(ii) It is not affected by the value of the extreme items.
Limitations of Scatter Diagram : Although it gives a bird’s eye view of the relationship between two variables, yet it gives no exact idea about the degree of correlation. Besides, it cannot be treated mathematically.
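Because a scatter diagram gives no exact idea of the degree of correlation, a numerical coefficient can serve as a rough cross-check of the interpretation. The sketch below uses the product-moment coefficient, which is not described in these notes and is brought in here only as an assumption, on the height and weight figures of the ten students.

```python
import math

height = [180, 150, 158, 165, 175, 163, 145, 195, 180, 155]   # X, in cm
weight = [65, 54, 55, 61, 60, 54, 50, 63, 65, 50]             # Y, in kg

def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    return sxy / math.sqrt(sxx * syy)

r = pearson_r(height, weight)
print(round(r, 2))   # a value close to +0.9
```

A coefficient this close to +1 is consistent with the high degree of positive correlation read off the scatter diagram.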
What are Quartiles, Deciles and Percentiles?
The values which divide a series into four, ten and hundred equal parts are known as Quartiles, Deciles and Percentiles respectively.
Define measures of central tendency.
The values which are representative of the various distributions are known as measures of central tendency.
What is meant by partition values?
The values which divide the series into more than two equal parts are known as partition values.
What are the characteristics of a good table?
It should be simple, compact, complete and self explanatory.