An Introduction to Data Types in Statistics
Data types, also known as Measurement Scales, are an important concept because each statistical method is valid only for certain types of data. Understanding the different data types is a prerequisite for Exploratory Data Analysis (EDA), since only certain statistical measures are meaningful for a given data type. The same understanding guides the choice of Visualization Techniques. Data types can broadly be classified as Categorical Data and Numerical Data.
Categorical Data represents characteristics, such as a person's gender or language. Such data may be coded with numbers, e.g., 1 for female and 0 for male, but those numbers have no mathematical meaning. Categorical data can be further divided into two types: Nominal Data and Ordinal Data. Nominal Data consists of discrete units used to label variables that have no quantitative value; the values are simply labels and have no order. Ordinal Data also consists of discrete units, but unlike nominal data the units are ordered, for example a satisfaction rating of low, medium or high. The order is meaningful, but the distances between the levels are not necessarily equal.
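The distinction between nominal and ordinal data can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the language labels, the three-level satisfaction scale and all variable names are made up for the example and are not taken from any real dataset.

```python
from collections import Counter
from statistics import median_low, mode

# Nominal data: labels with no order (hypothetical sample).
# Counting the labels and finding the mode are meaningful; sorting is not.
languages = ["English", "Malayalam", "Hindi", "Malayalam", "English", "Malayalam"]
language_counts = Counter(languages)
most_common_language = mode(languages)

# Ordinal data: labels with a rank order (hypothetical survey scale).
SATISFACTION_ORDER = ["low", "medium", "high"]
responses = ["medium", "high", "low", "medium", "high", "high", "medium"]

# Map each label to its rank so the responses can be sorted and compared.
rank = {label: i for i, label in enumerate(SATISFACTION_ORDER)}
sorted_ranks = sorted(rank[r] for r in responses)

# The median is meaningful because the labels are ordered; median_low
# always returns one of the observed ranks, never an in-between value.
median_response = SATISFACTION_ORDER[median_low(sorted_ranks)]

print(language_counts)
print(most_common_language)
print(median_response)
```

A median makes sense for the ordinal responses because they can be ranked, while for the nominal language labels only counts and the mode are meaningful.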
Numerical Data can be classified as Discrete Data and Continuous Data. Discrete data takes only certain separate values, typically whole numbers; it is counted rather than measured. An example is the number of heads that come up in a series of coin tosses. Continuous Data represents measurements; its values cannot be counted but can be measured. An example is the weight or height of a person, which can take any value within an interval of the real number line.
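As a rough sketch of the difference, the snippet below counts heads in simulated coin tosses (discrete) and groups a few heights into intervals (continuous). The height values, the 10 cm interval width and the helper `interval_label` are assumptions made purely for illustration.

```python
import random
from collections import Counter

random.seed(0)  # fixed seed so the sketch is repeatable

# Discrete data: the number of heads in 10 tosses is a count,
# so it is always a whole number between 0 and 10.
tosses = [random.choice("HT") for _ in range(10)]
heads = tosses.count("H")

# Continuous data: heights in centimetres (made-up values).
# Any value within a range is possible, so we describe them with intervals.
heights_cm = [158.2, 171.5, 164.9, 180.3, 169.0, 175.8]


def interval_label(value, width=10):
    """Return the half-open interval [low, low + width) containing value."""
    low = int(value // width) * width
    return f"[{low}, {low + width})"


height_bins = Counter(interval_label(h) for h in heights_cm)

print(heads)
print(height_bins)
```

Counting works directly for the coin tosses, whereas the heights first have to be grouped into intervals before a frequency table becomes useful.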
Continuous Data can also be classified into Interval Data and Ratio Data. Interval values are ordered numeric values with equal, exactly known differences between them, but no true zero. Temperature in degrees Celsius is a common example: 0 °C does not mean the absence of temperature. With interval data we can add and subtract, but we cannot meaningfully multiply, divide or calculate ratios. Because there is no true zero, statistics that rely on ratios, such as the coefficient of variation, cannot be applied, although measures such as the mean and standard deviation can. Ratio values are similar to interval values; the difference is that they have an absolute zero, so ratios are meaningful. The height and weight of a person are good examples.
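One way to see why ratios are not meaningful on an interval scale is to change the unit of measurement. The sketch below uses temperature in Celsius and Fahrenheit as the interval example, which is an assumption added here for illustration, and height as the ratio example; the numbers and function names are hypothetical.

```python
import math


def celsius_to_fahrenheit(c):
    """Convert a Celsius temperature to Fahrenheit."""
    return c * 9 / 5 + 32


def cm_to_inches(cm):
    """Convert a length in centimetres to inches."""
    return cm / 2.54


# Interval scale: both temperature scales have an arbitrary zero point.
t1_c, t2_c, t3_c = 10.0, 20.0, 30.0
t1_f, t2_f, t3_f = (celsius_to_fahrenheit(t) for t in (t1_c, t2_c, t3_c))

# Equal differences stay equal after changing the unit ...
equal_steps_c = math.isclose(t2_c - t1_c, t3_c - t2_c)
equal_steps_f = math.isclose(t2_f - t1_f, t3_f - t2_f)

# ... but the ratio depends on the unit, so "twice as hot" has no meaning.
ratio_celsius = t2_c / t1_c
ratio_fahrenheit = t2_f / t1_f

# Ratio scale: height has an absolute zero, so the ratio survives a unit change.
h1_cm, h2_cm = 90.0, 180.0
height_ratio_cm = h2_cm / h1_cm
height_ratio_in = cm_to_inches(h2_cm) / cm_to_inches(h1_cm)

print(equal_steps_c, equal_steps_f)
print(ratio_celsius, ratio_fahrenheit)
print(height_ratio_cm, height_ratio_in)
```

The temperature ratio changes with the unit while the height ratio does not, which is the practical consequence of having, or lacking, a true zero.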
Continuous data must be analyzed differently from categorical data; otherwise the analysis will be wrong. Knowing the type of data we are dealing with therefore enables us to choose the correct method of analysis. You can become more familiar with these concepts and get more practice through a Data Science training institute in Kochi. The Data Science courses in Kerala offer extensive training that helps you understand the lifecycle of a Data Science project.