Data Visualization on Pima Indian Diabetes Dataset

Triumph Ogeh
Jul 25, 2020

These visualizations show the features that can help determine whether a patient will be diagnosed with diabetes, based on the Pima Indian Diabetes dataset. The data is available on Kaggle.

Let’s describe the features used in these visualizations:

  • Pregnancy: the number of pregnancies a patient has had.
  • BMI: body mass index, the ratio of a patient's mass (kg) to the square of the patient's height (m). A BMI below 18.5 is underweight, 18.5 – 24.9 is a normal or healthy weight, 25.0 – 29.9 is overweight, and 30.0 or greater is obese.
  • Insulin: insulin is a hormone that helps control the body's blood sugar (glucose) level.
  • Glucose: glucose is the main type of sugar in the blood.
  • Skin Thickness: skin thickness is primarily determined by collagen content, and it tends to increase with the duration of diabetes.
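
The BMI definition and bands above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original analysis: the function names are hypothetical, and the bands are treated as half-open intervals (for example, 24.95 counts as normal), which is an assumption made to close the small gaps between the listed ranges.

```python
def bmi(mass_kg: float, height_m: float) -> float:
    """Body mass index: mass in kilograms divided by height in metres squared."""
    if height_m <= 0:
        raise ValueError("height_m must be positive")
    return mass_kg / height_m ** 2


def bmi_category(value: float) -> str:
    """Map a BMI value to the bands listed above (half-open intervals, an assumption)."""
    if value < 18.5:
        return "underweight"
    if value < 25.0:
        return "normal"
    if value < 30.0:
        return "overweight"
    return "obese"


# Example: 70 kg at 1.75 m gives a BMI of about 22.9, which falls in the normal band.
print(bmi_category(bmi(70, 1.75)))
```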

Figure 1

Figure 1 above shows that patients with glucose levels from 0 to 130 are less likely to be diagnosed with diabetes. In fact, patients with glucose levels of 0 to 100 were rarely diagnosed with diabetes, even though their insulin levels were relatively low. Patients with glucose levels above 130 were more likely to be diagnosed with diabetes.
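
One way to check a reading like this against the raw numbers is to compute the share of diagnosed patients in each glucose band. The sketch below uses only the standard library and assumes the Kaggle CSV has columns named `Glucose` and `Outcome` (with 1 meaning diagnosed); the function names and the local filename `diabetes.csv` are hypothetical. The band edges 100 and 130 come from the description of Figure 1.

```python
import csv
from bisect import bisect_right


def load_rows(path):
    """Read the CSV into a list of dicts (column names assumed from the Kaggle file)."""
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle))


def diagnosis_rate_by_band(rows, column="Glucose", edges=(100, 130)):
    """Return {band_label: (n_patients, share_diagnosed)} for each band of `column`."""
    labels = [f"< {edges[0]}"]
    labels += [f"{lo} to < {hi}" for lo, hi in zip(edges, edges[1:])]
    labels.append(f">= {edges[-1]}")
    totals = [0] * len(labels)
    positives = [0] * len(labels)
    for row in rows:
        # bisect_right gives the index of the band the value falls into
        band = bisect_right(edges, float(row[column]))
        totals[band] += 1
        positives[band] += int(row["Outcome"])
    return {
        label: (n, pos / n if n else None)
        for label, n, pos in zip(labels, totals, positives)
    }


# Usage, assuming the file was saved locally under this hypothetical name:
# rows = load_rows("diabetes.csv")
# print(diagnosis_rate_by_band(rows))
```

If the shares rise from the lowest band to the highest, the table supports the pattern seen in the scatter plot.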

Figure 2

Figure 2 shows patients by their number of pregnancies. Patients with fewer than 7 pregnancies are visibly less likely to be diagnosed with diabetes. That is, the higher the number of pregnancies, the higher the chance of being diagnosed with diabetes.
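
The trend described for Figure 2 can be tabulated as the share of diagnosed patients at each pregnancy count. This is a rough sketch rather than the method used to draw the figure: it assumes `rows` is a list of dicts such as `csv.DictReader` produces, with columns named `Pregnancies` and `Outcome` as in the Kaggle file, and the function name is hypothetical.

```python
from collections import defaultdict


def diagnosis_rate_by_count(rows, column="Pregnancies"):
    """Return {count: (n_patients, share_diagnosed)}, ordered by count."""
    tally = defaultdict(lambda: [0, 0])  # count -> [patients, diagnosed]
    for row in rows:
        key = int(row[column])
        tally[key][0] += 1
        tally[key][1] += int(row["Outcome"])
    return {k: (n, pos / n) for k, (n, pos) in sorted(tally.items())}
```

Counts with only a handful of patients give noisy shares, so the number of patients is returned next to each rate.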

Figure 3

Figure 3 shows the relationship between skin thickness and BMI. From the graph we can deduce that patients with a low BMI (underweight) and high skin thickness are most likely to be diagnosed with diabetes. If you look at the zero line on the graph, you will notice that skin thickness increases along it, and the patients who fall in that category were all diagnosed with diabetes.
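
Points sitting on a zero line are worth inspecting directly, since a BMI of exactly zero is not physically possible and such entries may be placeholders for missing measurements (an assumption, not something the dataset description here confirms). The sketch below counts the rows where a chosen column is exactly zero and how many of those patients were diagnosed. It assumes `rows` is a list of dicts with `BMI` and `Outcome` columns as in the Kaggle file; the function name is hypothetical.

```python
def zero_value_summary(rows, column="BMI"):
    """Count rows where `column` is exactly zero and how many of them are diagnosed."""
    zero_rows = [row for row in rows if float(row[column]) == 0.0]
    diagnosed = sum(int(row["Outcome"]) for row in zero_rows)
    return len(zero_rows), diagnosed
```

Calling it with `column="SkinThickness"` runs the same check on the other axis of the figure.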

Figure 4

Figure 4 above shows that patients with glucose levels below 120 and BMI values a little above 20 (healthy/normal weight) rarely have diabetes. Conversely, a patient with a high BMI and a high glucose level is most likely to be diagnosed with diabetes.
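
The two-variable pattern in Figure 4 can be summarised by splitting patients into four groups and comparing the share diagnosed in each. In this sketch the glucose cut-off of 120 comes from the text, while the BMI cut-off of 25 (the upper edge of the normal band) is an assumption chosen for illustration. It expects `rows` as a list of dicts with `Glucose`, `BMI` and `Outcome` columns, as in the Kaggle file; the function name is hypothetical.

```python
from collections import Counter


def quadrant_rates(rows, glucose_cut=120.0, bmi_cut=25.0):
    """Return {(glucose_side, bmi_side): (n_patients, share_diagnosed)}."""
    totals, positives = Counter(), Counter()
    for row in rows:
        key = (
            "low glucose" if float(row["Glucose"]) < glucose_cut else "high glucose",
            "low BMI" if float(row["BMI"]) < bmi_cut else "high BMI",
        )
        totals[key] += 1
        positives[key] += int(row["Outcome"])
    return {key: (n, positives[key] / n) for key, n in totals.items()}
```

A low share in the ("low glucose", "low BMI") group and a high share in the ("high glucose", "high BMI") group would match what the figure suggests.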

Conclusions

From the visualizations above it is reasonable to conclude that:

  • A high glucose level is associated with a diabetes diagnosis.
  • People with a high number of pregnancies are more likely to have diabetes.
  • Underweight people (very low BMI) are also at risk of diabetes if their skin thickness is relatively high.
  • Overweight people (high BMI) are at a higher risk of diabetes.

“Please note that these conclusions were derived solely from the data visualizations above.”

Thanks for reading!
