(org-show-animate '("Quantitative Methods" "Descriptive Statistics" "Vikas Rawal" "Prachi Bansal" "" "" ""))
Frequency
Measures of central tendency
Summary positions
Measures of dispersion
library(data.table)

# Example data: 16 workers with their salary and sex
data.table(names = c("Anil", "Neeraj", "Savita", "Srimati",
                     "Rekha", "Pooja", "Alex", "Shahina",
                     "Ghazal", "Lakshmi", "Rahul", "Shahrukh",
                     "Naman", "Deepak", "Shreya", "Rukhsana"),
           salary = c(71, 50, 65, 40,
                      45, 42, 46, 43,
                      45, 43, 45, 45,
                      850, 100, 46, 48) * 1000,
           sex = c("M", "M", "F", "F",
                   "F", "F", "M", "F",
                   "F", "F", "M", "M",
                   "M", "M", "F", "F")) -> workers

# Add a serial number and print the columns in a readable order
workers$sno <- seq_len(nrow(workers))
workers[, .(sno, names, sex, salary)]
sno  names     sex  salary
  1  Anil      M     71000
  2  Neeraj    M     50000
  3  Savita    F     65000
  4  Srimati   F     40000
  5  Rekha     F     45000
  6  Pooja     F     42000
  7  Alex      M     46000
  8  Shahina   F     43000
  9  Ghazal    F     45000
 10  Lakshmi   F     43000
 11  Rahul     M     45000
 12  Shahrukh  M     45000
 13  Naman     M    850000
 14  Deepak    M     1e+05
 15  Shreya    F     46000
 16  Rukhsana  F     48000
# Number of workers in each category of sex
workers[, .(frequency = length(sno)), .(sex)]
Measures of Central Tendency
# Mean and median salary for all workers
workers[, .(mean_salary = round(mean(salary), 1),
            median_salary = quantile(salary, probs = 0.5))]
mean_salary  median_salary
     101500          45500
# Mean and median salary separately for each sex
workers[, .(mean_salary = round(mean(salary), 1),
            median_salary = quantile(salary, probs = 0.5)), .(sex)]
sex  mean_salary  median_salary
M       172428.6          50000
F        46333.3          45000
First quartile
Second quartile (median)
Third quartile
Deciles
Quintiles
Percentiles
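All of the positions listed above can be obtained from base R's `quantile()`, the same function used for the median earlier. The following is a minimal sketch, assuming the salary figures from the `workers` table (repeated here so the snippet runs on its own) and R's default interpolation rule (`type = 7`); other rules can give slightly different values in a sample this small.

```r
# Salaries of the 16 workers, repeated so the snippet is self-contained
salary <- c(71, 50, 65, 40, 45, 42, 46, 43,
            45, 43, 45, 45, 850, 100, 46, 48) * 1000

# Cut points that divide the sorted data into 4, 5, 10 and 100 parts
quartiles   <- quantile(salary, probs = (1:3) / 4)    # Q1, median, Q3
quintiles   <- quantile(salary, probs = (1:4) / 5)
deciles     <- quantile(salary, probs = (1:9) / 10)
percentiles <- quantile(salary, probs = (1:99) / 100)

quartiles
```

The second quartile, the fifth decile and the fiftieth percentile are all the same position: the median.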
Range and other measures based on positions
$range=max-min$
# Smallest and largest salary, and the distance between them
workers[, .(min_salary = min(salary),
            max_salary = max(salary),
            range = max(salary) - min(salary))]
min_salary  max_salary   range
     40000      850000  810000
Range and other measures based on positions
The distance between any two positions (deciles, quintiles, percentiles) can be used as a measure of dispersion.
$inter.quartile.range=Q3-Q1$
## summary(workers$salary)
# Pairs of positions whose distance can serve as a measure of dispersion
quantile(workers$salary, probs = c(0.25, 0.75))
quantile(workers$salary, probs = c(0.1, 0.9))
quantile(workers$salary, probs = c(0.1, 0.95))
quantile(workers$salary, probs = c(0.25, 0.95))
quantile(workers$salary, probs = c(0, 0.75))
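Each `quantile()` call above returns a pair of positions; the dispersion measure itself is the gap between the two. The following is a minimal sketch in base R, assuming the same salary figures and the default `type = 7` quantile rule. The helper name `position_gap` is made up for this illustration and is not part of any package.

```r
# Salaries of the 16 workers, repeated so the snippet is self-contained
salary <- c(71, 50, 65, 40, 45, 42, 46, 43,
            45, 43, 45, 45, 850, 100, 46, 48) * 1000

# Distance between two positions of the distribution (hypothetical helper)
position_gap <- function(x, lower, upper) {
  unname(diff(quantile(x, probs = c(lower, upper))))
}

iqr_salary <- position_gap(salary, 0.25, 0.75)  # inter-quartile range, Q3 - Q1
decile_gap <- position_gap(salary, 0.10, 0.90)  # 9th decile minus 1st decile
full_range <- position_gap(salary, 0, 1)        # max - min

c(iqr = iqr_salary, decile_gap = decile_gap, range = full_range)
```

In this data set the single very large salary (850000) lies above the 90th percentile, so it stretches the range but leaves the other two gaps unaffected.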
Variance, Standard Deviation and Coefficient of Variation
$variance=\frac{1}{n} \times \sum (x_i - \bar{x})^2$
$standard.deviation = \sqrt{variance}$
$cov=\frac{standard.deviation}{mean}$
# Variance, standard deviation and coefficient of variation for all workers
workers[, .(var_salary = round(var(salary), 1),
            sd_salary = round(sqrt(var(salary)), 1),
            cov_salary = round(sqrt(var(salary)) / mean(salary), 2))]
 var_salary  sd_salary  cov_salary
40075200000   200187.9        1.97
# The same three measures separately for each sex
workers[, .(var_salary = round(var(salary), 1),
            sd_salary = round(sqrt(var(salary)), 1),
            cov_salary = round(sqrt(var(salary)) / mean(salary), 2)), .(sex)]
sex   var_salary  sd_salary  cov_salary
M    89680952381   299467.8        1.74
F       54500000     7382.4        0.16
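The variance formula on this slide divides by $n$, whereas R's `var()` divides by $n - 1$ (the sample variance), so the figures printed above are slightly larger than the formula itself would give. The following is a minimal sketch of the difference in base R, assuming the same salary figures; `pop_var` and `sample_var` are names introduced here for illustration.

```r
# Salaries of the 16 workers, repeated so the snippet is self-contained
salary <- c(71, 50, 65, 40, 45, 42, 46, 43,
            45, 43, 45, 45, 850, 100, 46, 48) * 1000
n <- length(salary)

pop_var    <- mean((salary - mean(salary))^2)  # divides by n, as in the formula
sample_var <- var(salary)                      # divides by n - 1

# The two versions differ only by the factor (n - 1) / n
all.equal(pop_var, sample_var * (n - 1) / n)

sd_salary  <- sqrt(sample_var)                 # standard deviation
cov_salary <- sd_salary / mean(salary)         # coefficient of variation
```

With only 16 observations the two variances differ by about six per cent; the gap shrinks as the sample grows.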
Graphical Displays of Quantitative Information: Dispersion
Histogram with relative densities
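With relative densities, bar heights are scaled so that the areas of the bars add up to one, instead of showing raw counts. The following is a minimal sketch using base R's `hist()` with `freq = FALSE`, assuming the same salary figures; the suggested number of breaks (20) and the labels are arbitrary choices made for this illustration.

```r
# Salaries of the 16 workers, repeated so the snippet is self-contained
salary <- c(71, 50, 65, 40, 45, 42, 46, 43,
            45, 43, 45, 45, 850, 100, 46, 48) * 1000

# freq = FALSE plots densities rather than counts
h <- hist(salary, freq = FALSE, breaks = 20,
          main = "Salaries of workers", xlab = "Salary")

# Total area of the bars (height times width)
sum(h$density * diff(h$breaks))
```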
Boxplots: invented by John Tukey in 1970
Many variations have been proposed since then, though the essential form and idea have remained intact.
Boxplots: Useful to identify extreme values
Boxplots: Useful for comparisons across categories
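Both uses can be sketched with base R's `boxplot()`, assuming the salary and sex columns of the `workers` table (rebuilt here as a plain data frame, `workers_df`, so the snippet runs on its own). By default, observations lying more than 1.5 times the length of the box beyond its edges are drawn as individual points.

```r
# Salary and sex of the 16 workers, repeated so the snippet is self-contained
workers_df <- data.frame(
  salary = c(71, 50, 65, 40, 45, 42, 46, 43,
             45, 43, 45, 45, 850, 100, 46, 48) * 1000,
  sex = c("M", "M", "F", "F", "F", "F", "M", "F",
          "F", "F", "M", "M", "M", "M", "F", "F")
)

# One boxplot for everyone: extreme values show up as separate points
boxplot(workers_df$salary, ylab = "Salary")
extreme <- boxplot.stats(workers_df$salary)$out  # the values flagged as extreme

# Side-by-side boxplots: compare the distributions across categories
boxplot(salary ~ sex, data = workers_df, xlab = "Sex", ylab = "Salary")
```

Because one salary is far larger than the rest, the boxes themselves are squeezed near the bottom of the plot; this is what makes the extreme value easy to spot.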
Graphical Displays of Quantitative Information: Common Pitfalls
Common uses of statistical graphics
To show trends over time
To show mid-point variations across categories
To show composition
(less commonly, though more usefully) to show/analyse dispersion
Mis-representation: illustrations from Thomas Piketty’s work (source Noah Wright)
The problem multiplied with the advent of spreadsheets
Paul Krugman on Fiscal Austerity
What does this graph show?
Source: https://www.nytimes.com/2018/11/02/opinion/the-perversion-of-fiscal-policy-slightly-wonkish.html
What did Paul Krugman say?
“Here’s what fiscal policy should do: it should support demand when the economy is weak, and it should pull that support back when the economy is strong. As John Maynard Keynes said, ‘The boom, not the slump, is the right time for austerity.’ And up until 2010 the U.S. more or less followed that prescription. Since then, however, fiscal policy has become perverse: first austerity despite high unemployment, now expansion despite low unemployment.”
How could we better show the relationship between unemployment and fiscal austerity?