You are expected to do the following:
1. Explore the dataset and extract insights using Exploratory Data Analysis.
2. Prove (or disprove) that the medical claims made by people who smoke are greater than those made by people who don’t. [Hint: Formulate a hypothesis and prove/disprove it]
3. Prove (or disprove) with statistical evidence that the BMI of females is different from that of males.
4. Is the proportion of smokers significantly different across different regions? [Hint: Create a contingency table/cross tab. Use the function stats.chi2_contingency()]
5. Is the mean BMI of women with no children, one child, and two children the same? Explain your answer with statistical evidence.
* Consider a significance level of 0.05 for all tests.
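As a rough illustration of how tasks 2 to 5 could be set up, here is a minimal sketch in Python. It is not a prescribed solution. The brief only names stats.chi2_contingency(); the choice of Welch's t-test for tasks 2 and 3 and one-way ANOVA for task 5 is an assumption, as are the lowercase column names (sex, bmi, children, smoker, region, charges). Because the dataset file is not named in the brief, the sketch runs on synthetic data from a hypothetical helper, make_synthetic_data(), which you would replace with the real data.

```python
import numpy as np
import pandas as pd
from scipy import stats

ALPHA = 0.05  # significance level stated in the brief


def make_synthetic_data(n=400, seed=0):
    """Hypothetical stand-in for the real dataset (column names are assumed)."""
    rng = np.random.default_rng(seed)
    df = pd.DataFrame({
        "sex": rng.choice(["female", "male"], size=n),
        "bmi": rng.normal(30, 6, size=n),
        "children": rng.integers(0, 4, size=n),
        "smoker": rng.choice(["yes", "no"], size=n, p=[0.2, 0.8]),
        "region": rng.choice(
            ["northeast", "southeast", "southwest", "northwest"], size=n
        ),
    })
    # Synthetic charges: smokers are given a large artificial offset.
    offset = np.where(df["smoker"] == "yes", 20000, 0)
    df["charges"] = rng.normal(8000, 3000, size=n) + offset
    return df


def run_tests(df):
    """Return a p-value for each of tasks 2 to 5."""
    results = {}

    # Task 2: H0 mean charges of smokers <= non-smokers; H1 smokers > non-smokers.
    smokers = df.loc[df["smoker"] == "yes", "charges"]
    non_smokers = df.loc[df["smoker"] == "no", "charges"]
    results["smoker_charges"] = float(stats.ttest_ind(
        smokers, non_smokers, equal_var=False, alternative="greater"
    ).pvalue)

    # Task 3: H0 mean BMI of females == males (two-sided test).
    female_bmi = df.loc[df["sex"] == "female", "bmi"]
    male_bmi = df.loc[df["sex"] == "male", "bmi"]
    results["bmi_by_sex"] = float(
        stats.ttest_ind(female_bmi, male_bmi, equal_var=False).pvalue
    )

    # Task 4: H0 smoking status is independent of region.
    table = pd.crosstab(df["region"], df["smoker"])
    chi2, p_value, dof, expected = stats.chi2_contingency(table)
    results["smoker_by_region"] = float(p_value)

    # Task 5: H0 mean BMI of women with 0, 1 and 2 children is the same.
    women = df[df["sex"] == "female"]
    groups = [women.loc[women["children"] == k, "bmi"] for k in (0, 1, 2)]
    results["bmi_by_children"] = float(stats.f_oneway(*groups).pvalue)

    return results


if __name__ == "__main__":
    for name, p in run_tests(make_synthetic_data()).items():
        decision = "reject H0" if p < ALPHA else "fail to reject H0"
        print(f"{name}: p = {p:.4f} -> {decision}")
```

The sketch does not check the assumptions behind each test (for example normality or equal variances), which a full notebook would be expected to cover before drawing conclusions.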
Context – Leveraging customer information is of paramount importance for most businesses. In the case of an insurance company, customer attributes like the ones listed below can be crucial in making business decisions. Hence, knowing how to explore such data and generate value from it is an invaluable skill.
Data Dictionary –
Age – This is an integer indicating the age of the primary beneficiary (excluding those above 64 years, since they are generally covered by the government).
Sex – This is the policy holder’s gender, either male or female.
BMI – This is the body mass index (BMI), which provides a sense of how over or under-weight a person is relative to their height. BMI is equal to weight (in kilograms) divided by height (in meters) squared. An ideal BMI is within the range of 18.5 to 24.9.
Children – This is an integer indicating the number of children / dependents covered by the insurance plan.
Smoker – This is yes or no depending on whether the insured regularly smokes tobacco.
Region – This is the beneficiary’s place of residence in the U.S., divided into four geographic regions – northeast, southeast, southwest, or northwest.
Charges – Individual medical costs billed by health insurance.
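The BMI definition above can be written out as a small Python sketch. The function names bmi() and in_ideal_range() are hypothetical and chosen for illustration only; the formula and the 18.5 to 24.9 range come from the data dictionary.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight in kilograms divided by height in meters squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def in_ideal_range(value: float, low: float = 18.5, high: float = 24.9) -> bool:
    """True if the BMI falls inside the ideal range given in the data dictionary."""
    return low <= value <= high


if __name__ == "__main__":
    example = bmi(70.0, 1.75)  # 70 kg at 1.75 m
    print(round(example, 2), in_ideal_range(example))
```

In the dataset itself BMI is already provided as a column, so a helper like this would mainly be useful for sanity-checking values or labelling rows during Exploratory Data Analysis.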
Please note the following:
There are two parts to the submission:
1. A well-commented Jupyter notebook [format – .ipynb]
2. A presentation, as you would present it to top management [format – .ppt / .pptx]

Reference no: EM132069492
