Article in English, in PDF format

Nauki ścisłe priorytetem społeczeństwa opartego na wiedzy (Exact sciences as a priority of a knowledge-based society)
Articles for the CMS platform
Dr Anna Rybak
Institute of Computer Science
University of Białystok
Ways of presenting problems in statistics
Introduction
This article presents the basics of descriptive statistics, which deals with describing, presenting
in various forms and analysing the results of research conducted on a random sample. It
demonstrates methods of organising and visualising statistical data (frequency distributions,
histograms and other diagrams), as well as descriptive statistics and the examination of
correlation between statistical features. These topics go beyond the core mathematics curriculum,
but they are not difficult, and with the help of appropriate computer visualisation a student can
grasp them easily.
Problem
A survey was conducted on a sample of 40 students. Each student answered the question:
"How many books did you read last month?"
Here are the students' answers:
5, 1, 2, 0, 5, 4, 4, 1, 1, 1, 2, 0, 0, 0, 3, 1, 1, 2, 5, 4, 6, 4, 0, 1, 2, 3, 5, 2, 1, 2, 3, 0, 2,
4, 3, 2, 2, 3, 0, 1.
What can be inferred about reading in this group of young people?
Theoretical introduction
Statistics is a branch of mathematics that deals with statistical inference, that is, formulating
and verifying general conclusions (statistical hypotheses) on the basis of a finite number of
results of random observations.
When statistical research is carried out on a certain collectivity (population), a representative
group called a sample is chosen. The sample is the subject of direct examination, and the results
are generalised to the whole population.
The phenomenon being examined is called a statistical feature (the name "variable" may also be
used; it is common in statistical software), and the results of research conducted on a sample are
called feature values.
The reliability of such research depends mainly on the sample chosen.
Statistics is divided into two branches:
descriptive statistics, which deals with organising, presenting in different forms and
analysing the results of research conducted on a random sample
and
mathematical statistics, which deals with drawing conclusions about the distribution of feature
values in the whole population on the basis of the results of sample research.
Project co-financed by the European Union under the European Social Fund
Results obtained from research on a sample may be presented in various graphical forms (charts,
various diagrams) and analysed with so-called numerical statistics.
In the mass media we very often see the methods of descriptive statistics used to visualise the
results of research on social, political, economic, cultural and other phenomena.
Appropriate software is very useful when working with statistical data (especially for producing
diagrams and carrying out complex mathematical operations).
The program used here to consider statistical issues is Statistics and Probability
(Statystyka i prawdopodobieństwo). Its English demo version may be downloaded from
www.vusoft2.nl (file vustatengdemo.zip). Any spreadsheet may also be used.
Examination of the issue
In the problem presented in the introduction, all students of a certain school may be taken as the
population. The research sample comprises forty chosen students, and the feature examined is the
number of books read.
The question posed:
"What can be said about reading in this group of young people?"
should be made more specific. Which specific questions should be asked in order to obtain as much
information as possible, describing the phenomenon as exactly as possible?
First of all, it should be noticed that the data are not organised, so they are hard to read. Any
analysis or drawing of conclusions is hindered.
With the help of the program Statistics and Probability (Statystyka i prawdopodobieństwo), the
data may be arranged in a frequency table (called Number board in the program), using the options
Statistics/Tables/Number board.
The table is shown below:
Number of books    Number of students
0-1                16
2-3                14
4-5                9
6-7                1
Total              40
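The grouping shown in the table above can also be reproduced outside the program. The following is a minimal sketch in Python, not the program's own method; the function name grouped_counts and the class width of 2 are assumptions made for illustration. Note that counting single values in the list exactly as printed gives nine 2s and five 3s, one off from the per-value table that follows in the article; the two-value classes used here are not affected.

```python
from collections import Counter

# Survey answers exactly as listed in the Problem section (40 students).
answers = [5, 1, 2, 0, 5, 4, 4, 1, 1, 1, 2, 0, 0, 0, 3, 1, 1, 2, 5, 4,
           6, 4, 0, 1, 2, 3, 5, 2, 1, 2, 3, 0, 2, 4, 3, 2, 2, 3, 0, 1]


def grouped_counts(values, width=2):
    """Count values in classes of equal width: 0-1, 2-3, ... for width=2."""
    # Key each value by the lower bound of its class.
    counts = Counter((v // width) * width for v in values)
    # Build readable labels, ordered by the numeric lower bound.
    return {f"{low}-{low + width - 1}": counts[low] for low in sorted(counts)}


table = grouped_counts(answers)
for label, n in table.items():
    print(f"{label:<6} {n}")
print(f"{'Total':<6} {sum(table.values())}")
```

Calling grouped_counts(answers, width=1) would give one class per value, labelled 0-0, 1-1 and so on.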
The left column contains so-called classes of feature values. Data are grouped in this way when the
number of observations is high. If we want all values of the feature examined (that is, the number
of books read) listed in the left column, we use the same option again, press the button Classes
and set the number of classes to 7 (the number of distinct values of the feature examined):
Number of books    Number of students
0                  7
1                  9
2                  10
3                  4
4                  5
5                  4
6                  1
Total              40
If the field Percentage is marked, the following table will be obtained:
Number of books    Number of students    %
0                  7                     17.50
1                  9                     22.50
2                  10                    25.00
3                  4                     10.00
4                  5                     12.50
5                  4                     10.00
6                  1                     2.50
Total              40                    100.00
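The percentage column is each count divided by the sample size and multiplied by 100. A short Python sketch of that arithmetic follows, using the counts from the table above; the names counts and percentages are illustrative assumptions, not part of the program.

```python
# Frequencies from the table: number of books -> number of students.
counts = {0: 7, 1: 9, 2: 10, 3: 4, 4: 5, 5: 4, 6: 1}


def percentages(freq):
    """Return each count as a percentage of the total, rounded to 2 places."""
    total = sum(freq.values())
    return {value: round(100 * n / total, 2) for value, n in freq.items()}


shares = percentages(counts)
for books, pct in shares.items():
    print(f"{books}  {counts[books]:>2}  {pct:6.2f}")
```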
What questions can be asked on the basis of the above tables? For instance:
What number of books was read by the largest number of students?
What percentage of the students examined is that?
What number of books was read by the smallest number of students?
What percentage of the students examined is that?
Are there students who do not read at all? How many of them are there? What percentage of
the students examined are they?
Answer the above questions. Perhaps you can see other questions that could be asked?
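One way to check your answers is to compute them from the table. The sketch below is only an illustration under stated assumptions: it uses the per-value counts from the table above, and it additionally computes the mean and the median as examples of numerical statistics, which the article mentions but does not calculate.

```python
from statistics import mean, median

# Frequencies from the table: number of books -> number of students.
counts = {0: 7, 1: 9, 2: 10, 3: 4, 4: 5, 5: 4, 6: 1}
total = sum(counts.values())

# Value reported by the largest / smallest number of students.
most_common = max(counts, key=counts.get)
least_common = min(counts, key=counts.get)
# Students who read no books at all.
non_readers = counts[0]

# Expand the table back into one value per student for summary measures.
values = [books for books, n in counts.items() for _ in range(n)]

print(f"most common: {most_common} books ({100 * counts[most_common] / total:.1f}%)")
print(f"least common: {least_common} books ({100 * counts[least_common] / total:.1f}%)")
print(f"non-readers: {non_readers} students ({100 * non_readers / total:.1f}%)")
print(f"mean: {mean(values):.3f}, median: {median(values)}")
```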
A table is not the most visual way of presenting data. The data will therefore be shown in
different types of diagrams. This can be done using the option Statistics/Diagrams and choosing an
appropriate type of diagram or Show all. The option Data diagrams from the main Menu may also be
used.
In this way a bar graph of the number of students for each number of books read may be obtained:
Or a bar graph of the percentage share of each number of books read:
What conclusions can you draw on the basis of the graph?
It is clearly visible that most of the students tested read few books (0, 1 or 2).
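A bar graph of this kind can be approximated in plain text, and the observation about 0, 1 or 2 books can be checked by adding up the counts. The following is a rough sketch using the per-value counts from the frequency table; the function text_bar_chart is a hypothetical helper, not a feature of the program.

```python
# Frequencies from the table: number of books -> number of students.
counts = {0: 7, 1: 9, 2: 10, 3: 4, 4: 5, 5: 4, 6: 1}


def text_bar_chart(freq, symbol="#"):
    """Return one line per value: the value, a bar of symbols, and the count."""
    return [f"{value} | {symbol * n} {n}" for value, n in freq.items()]


for line in text_bar_chart(counts):
    print(line)

# Share of students who read at most two books.
total = sum(counts.values())
few_books = sum(n for value, n in counts.items() if value <= 2)
share = 100 * few_books / total
print(f"{few_books} of {total} students ({share:.1f}%) read 0, 1 or 2 books")
```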
It is also possible to prepare a pie diagram, which is extremely useful for examining the structure
of a phenomenon:
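In a pie diagram each value receives a sector whose angle is proportional to its share of the sample, so the angles add up to 360 degrees. A small sketch of that calculation follows, assuming the per-value counts from the frequency table; the function name pie_angles is illustrative.

```python
# Frequencies from the table: number of books -> number of students.
counts = {0: 7, 1: 9, 2: 10, 3: 4, 4: 5, 5: 4, 6: 1}


def pie_angles(freq):
    """Return the sector angle in degrees for each value."""
    total = sum(freq.values())
    return {value: 360 * n / total for value, n in freq.items()}


angles = pie_angles(counts)
for books, angle in angles.items():
    print(f"{books} books: {angle:.0f} degrees")
```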
Examining the phenomenon of reading among students could be more thorough if the students tested
were asked about other features as well. The research can be extended so that reading can be seen
in the context of other features. To do that, the students are asked about their sex and their
mark in Polish.
The sexes are coded as follows:
the number 0 denotes male, 1 denotes female.
What questions can now be asked about the issue? For instance:
Who reads more books: males or females?
Who has better marks in Polish: males or females?
Is there any connection (in statistics it is called correlation) between the mark in Polish and
the number of books read?
To answer the questions about differences in reading between males and females, the data will be
divided into two groups according to the variable sex. The option Statistics/Data/Divide will be
used, with the variable sex chosen as the division variable. The data have been divided into two
classes: males and females. This is reflected in the frequency table:
Number of books    sex 0: No    sex 0: %    sex 1: No    sex 1: %    Total
0                  6            25.00       1            6.25        7
1                  9            37.50       0            0.00        9
2                  4            16.67       6            37.50       10
3                  3            12.50       1            6.25        4
4                  1            4.17        4            25.00       5
5                  0            0.00        4            25.00       4
6                  1            4.17        0            0.00        1
Total              24           100%        16           100%        40
as well as in the graphs, whose form we can choose:
We can judge which graph is more readable and more convenient for making comparisons.
The above graphs illustrate the reading level of males and females.
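The comparison between the two groups can also be expressed in numbers. The sketch below takes the counts for sex 0 and sex 1 from the table above, recomputes the percentages within each group and adds the mean number of books per student. The mean is an extra measure chosen here for illustration; it describes only this sample and does not by itself verify any hypothesis about the population.

```python
# Counts per group from the table: number of books -> number of students.
by_sex = {
    "male (0)": {0: 6, 1: 9, 2: 4, 3: 3, 4: 1, 5: 0, 6: 1},
    "female (1)": {0: 1, 1: 0, 2: 6, 3: 1, 4: 4, 5: 4, 6: 0},
}


def group_summary(freq):
    """Return group size, percentages within the group, and mean books read."""
    total = sum(freq.values())
    pct = {value: round(100 * n / total, 2) for value, n in freq.items()}
    mean_books = sum(value * n for value, n in freq.items()) / total
    return total, pct, mean_books


for name, freq in by_sex.items():
    size, pct, mean_books = group_summary(freq)
    print(f"{name}: n={size}, mean books={mean_books:.2f}")
    print("  " + ", ".join(f"{v}: {p:.2f}%" for v, p in pct.items()))
```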
If you want to present the grades in Polish, you must pick the appropriate variable before making
a graph. You may then obtain a graph like this:
To examine whether there is any correlation between the grade in Polish and the number of books
read, the option Statistics/Diagrams/Scatterplot can be used.
On the basis of the graph and the coefficients shown in the table above the diagrams, we can
conclude whether there is a correlation between the variables. If (in the case of a linear model)
the points on the graph lie close to a straight line, there is a correlation. The presence of a
correlation can also be inferred from the correlation coefficient, which is a number in the range
[-1, 1]. The higher the absolute value of this coefficient, the stronger the correlation.
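For a linear model, the coefficient in question is usually Pearson's correlation coefficient; that the program reports this particular coefficient is an assumption here. The sketch below shows how it can be computed. The article does not list the individual grades in Polish, so the example uses made-up toy values that only demonstrate the two extremes of the range.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's linear correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


# Toy values (NOT survey data): points lying exactly on a straight line.
xs = [1, 2, 3, 4]
rising = [2, 4, 6, 8]    # increases with xs, coefficient close to 1
falling = [8, 6, 4, 2]   # decreases with xs, coefficient close to -1
print(pearson_r(xs, rising), pearson_r(xs, falling))
```

Real data, such as grade and number of books for each of the 40 students, would give a value strictly between these extremes unless all points lay on one line.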
The data from a certain piece of research have been processed and analysed using graphs. An
extension of the research has been planned so that the phenomenon examined (reading level) could
be analysed more comprehensively.
Conclusions:
Reading among students is not well developed.
Females read more books than males.
Females have better grades in Polish than males.
There is a strong correlation between the grade in Polish and the number of books read, and
it is stronger in males than in females.
The conclusions are generalisations about the whole population, drawn on the basis of observing a
group. They are statistical hypotheses, with the exception of the first conclusion, which has been
formulated very imprecisely: we do not know what "reading is not well developed" means in a
mathematical sense. A statistical hypothesis may be verified (checked to be true or false) using
the methods of mathematical statistics. The program Statistics and Probability (Statystyka i
prawdopodobieństwo) can also help with that. This is not, however, a topic covered at secondary
school.