Data in Statistical Studies
Statistics is concerned with reliable numerical characteristics of the world, and begins with data preparation. Because data and their analyses are often reproduced to verify an original study, it is crucial to keep the vital information about the design of the data collection process. First, the type of study and the data collection process must be specified:
- In a survey we gather data on existing conditions, attitudes, or behaviors through questionnaire ratings or direct observation, together with demographic information such as age, sex, and income.
- In an observational study data are collected by observing existing conditions, which are not controlled as they would be in a proper scientific experiment.
- In a scientific study (or an experimental study) an experimenter controls the experiments and the various conditions of measurement (e.g., temperature and other quantitative factors).
A data set is a collection of observations on one or more variables, recorded as the outcome of an experiment or survey. It is very important to envisage the population from which the data are drawn, and to consider the data as a random sample that is representative of that underlying population. Each variable records a different characteristic, such as household income, metal temperature, cell diameter, seed type, or fertilizer level.
- A variable is called categorical or qualitative when the data are recorded as one of several categories. For example, in plant breeding the seed type of progeny may be observed as “round yellow”, “wrinkled yellow”, “round green”, or “wrinkled green”.
- A variable is called quantitative or numerical when data are numerical values such as 0.223 or 152.7.
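The distinction between the two kinds of variable can be illustrated with a small sketch in Python. The data set, the variable names (seed_type, cell_diameter, fertilizer_level), the recorded values, and the helper variable_type are all hypothetical, made up only for illustration. The rule used here (a variable is quantitative when every recorded value is a number) is a simplifying assumption, not a general definition.

```python
from collections import Counter

# Hypothetical data set: each record is one observation on three variables.
# All values are invented for illustration only.
observations = [
    {"seed_type": "round yellow",    "cell_diameter": 0.223, "fertilizer_level": 152.7},
    {"seed_type": "wrinkled yellow", "cell_diameter": 0.251, "fertilizer_level": 148.0},
    {"seed_type": "round green",     "cell_diameter": 0.198, "fertilizer_level": 152.7},
    {"seed_type": "round yellow",    "cell_diameter": 0.240, "fertilizer_level": 160.5},
]


def variable_type(values):
    """Label a variable 'quantitative' if every value is a number, else 'categorical'.

    Assumption for this sketch: categories are stored as text, not as numeric codes.
    """
    all_numbers = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
    )
    return "quantitative" if all_numbers else "categorical"


# Rearrange the records so that each variable maps to its list of observed values.
variables = {name: [row[name] for row in observations] for name in observations[0]}
types = {name: variable_type(values) for name, values in variables.items()}

# A categorical variable is naturally summarized by counts per category,
# a quantitative variable by a numerical summary such as the mean.
seed_counts = Counter(variables["seed_type"])
mean_diameter = sum(variables["cell_diameter"]) / len(variables["cell_diameter"])

print(types)
print(dict(seed_counts))
print(round(mean_diameter, 3))
```

A variable whose categories are coded as numbers (for example 1 and 2 for two seed types) would be mislabeled by this simple check, so in practice the type of each variable is best recorded explicitly along with the design of the data collection.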
© TTU Mathematics