Data Set
https://www.kaggle.com/datasets/mirichoi0218/insurance
Please answer the following questions using appropriate graphs built with the Plotly library, and state your observations for each question in a couple of sentences.
- What are the general outlook and summary statistics of the data?
  - a. Data types
  - b. Number of rows and columns
  - c. Description of attributes
  - d. Statistical outlook
  - e. NaN values
Tip: You do not need to use graphs to answer this question.
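
A minimal sketch of how items a, b, d, and e could be checked with pandas. The small DataFrame below is a made-up stand-in that only reuses the dataset's column names; its values are placeholders, not real records. The file name `insurance.csv` in the comment is an assumption about where the Kaggle download was saved.

```python
import pandas as pd

# Placeholder rows (values invented for illustration only).
# With the real data, something like: df = pd.read_csv("insurance.csv")  # assumed path
df = pd.DataFrame({
    "age": [19, 18, 28, 33, 32, 46],
    "sex": ["female", "male", "male", "male", "male", "female"],
    "bmi": [27.9, 33.8, 33.0, 22.7, 28.9, 33.4],
    "children": [0, 1, 3, 0, 0, 1],
    "smoker": ["yes", "no", "no", "no", "no", "yes"],
    "region": ["southwest", "southeast", "southeast",
               "northwest", "northwest", "northeast"],
    "charges": [16884.92, 1725.55, 4449.46, 21984.47, 3866.86, 8240.59],
})

dtypes = df.dtypes                      # a. data type of each column
n_rows, n_cols = df.shape               # b. number of rows and columns
summary = df.describe(include="all")    # d. statistics for numeric and categorical columns
nan_counts = df.isna().sum()            # e. number of NaN values per column

print(dtypes, (n_rows, n_cols), summary, nan_counts, sep="\n\n")
```

Item c (description of attributes) still has to be written by hand from the dataset page, since the CSV itself carries no documentation.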
- What are the frequencies of the categorical variables? Are there any significant differences between them?
Tip: You can use a bar chart and a pie chart as subplots.
- How are medical costs dispersed across the categorical variables? Does any observation stand out? Do you see a significant change when you add a third categorical variable?
Tip: You can use a bar chart to compare the total and average values of the numerical variables across the categorical variables.
- How are the numerical variables distributed?