Online Article
12th January 2023
Related topic: Quantitative research
Author: Ady Hameme N. A.
Exploratory data analysis (EDA) is a crucial step in the data analysis process, where analysts seek to understand the patterns, trends, and relationships within a dataset. It involves summarizing the data, visualizing it in various forms, and identifying any anomalies or outliers. EDA is an iterative process, and its purpose is to uncover insights and inform the development of hypotheses for further testing.
According to prominent textbooks, EDA is a crucial first step in data analysis because it helps identify the most important aspects of the data, as well as any potential problems or biases (Tukey, 1977). It also allows analysts to develop an understanding of the data and its underlying structure, which can inform subsequent statistical analyses (Cleveland, 1994).
One of the key tools in EDA is visualization, which allows analysts to explore and communicate the patterns in the data (Wilkinson, 1999). Visualizations can take many forms, including bar charts, scatter plots, and box plots, and they can reveal patterns such as trends, clusters, and relationships between variables. Additionally, visualization can help identify anomalies or outliers in the data, which may indicate errors or problems with the data collection process (Tufte, 1983).
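As a rough illustration of how a box plot surfaces outliers, the sketch below computes the five-number summary that a box plot draws and applies the conventional 1.5 × IQR rule for marking outlying points. The function names and the sample readings are hypothetical, invented for this example, and quartile conventions differ between tools, so other software may place the fences slightly differently.

```python
import statistics

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    # "inclusive" treats the data as the whole population of interest;
    # this is one of several common quartile conventions.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)

def tukey_outliers(values, k=1.5):
    """Return the values lying more than k * IQR beyond the quartiles."""
    _, q1, _, q3, _ = five_number_summary(values)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]

# Hypothetical measurements; 98 stands in for a possible data-entry error.
readings = [12, 14, 15, 15, 16, 17, 18, 19, 21, 98]

summary = five_number_summary(readings)
outliers = tukey_outliers(readings)
print("five-number summary:", summary)
print("flagged as outliers:", outliers)
```

A flagged value is only a prompt for further checking. Whether it reflects an error or a genuine extreme observation depends on how the data were collected.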
Another important aspect of EDA is summarization, which involves reducing the data to a form that is more manageable and easier to understand (Tukey, 1977). This can be achieved through techniques such as aggregation and sampling. Summarization lets analysts focus on the most important aspects of the data and highlights patterns or trends that may not be apparent in the raw data.
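The two summarization techniques named above can be sketched in a few lines. The example below is a minimal illustration using only the Python standard library. The sales records, the region labels, and the helper names are all hypothetical, and a real analysis would more likely use a dedicated data-analysis library.

```python
import random
import statistics
from collections import defaultdict

# Hypothetical records of (region, sale amount), invented for illustration.
sales = [
    ("north", 120.0), ("north", 135.0), ("north", 150.0),
    ("south", 80.0), ("south", 95.0),
    ("east", 200.0), ("east", 210.0), ("east", 190.0),
]

def aggregate_by_group(records):
    """Collapse (group, value) records into a per-group count and mean."""
    groups = defaultdict(list)
    for group, value in records:
        groups[group].append(value)
    return {
        group: {"count": len(vals), "mean": statistics.mean(vals)}
        for group, vals in groups.items()
    }

def sample_records(records, size, seed=0):
    """Draw a simple random sample without replacement.

    The seed is fixed only so that the example is repeatable.
    """
    rng = random.Random(seed)
    return rng.sample(records, size)

by_region = aggregate_by_group(sales)
subset = sample_records(sales, size=3)
print(by_region)
print(subset)
```

Aggregation replaces eight rows with three group-level figures, while sampling keeps individual rows but fewer of them. Which is appropriate depends on whether the question concerns groups or individual observations.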
In conclusion, EDA is a crucial first step in the data analysis process, and it involves visualizing and summarizing the data to identify patterns, trends, and anomalies. The classic textbooks on the subject treat EDA as essential for developing an understanding of the data and informing subsequent statistical analyses.
Cite this article: Ady Hameme, N. A. (2023, January 12). Introduction to Exploratory Data Analysis (EDA). Retrieved <insert month> <insert date>, <insert year>, from https://www.myadvrc.com/publications/article-8
References
Cleveland, W. S. (1994). The elements of graphing data. Monterey, CA: Wadsworth & Brooks/Cole.
Tufte, E. R. (1983). The visual display of quantitative information. Cheshire, CT: Graphics Press.
Tukey, J. W. (1977). Exploratory data analysis. Reading, MA: Addison-Wesley.
Wilkinson, L. (1999). The grammar of graphics. New York: Springer.
Header photo by Zukiman Mohamad. For illustration purposes only.