Machine Learning Methods for Breast Cancer Diagnosis

Authors

  • Matthew Lee, Kristin School
  • Zhaonan Sun, Ingenius Prep

DOI:

https://doi.org/10.47611/jsrhs.v11i3.2676

Keywords:

Machine Learning, Neural Networks, Logistic Regression, KNN, Breast Cancer, Disease Diagnosis

Abstract

As new diseases have surfaced in recent years, including the Ebola and COVID-19 epidemics, scientists have developed new tactics to combat them. New medicines and vaccines are one part of the response, but diagnosis needs special attention: without an accurate and timely diagnosis, a disease goes undetected and treatment is delayed or ineffective. Scientists have therefore begun using machine learning algorithms to support accurate and speedy diagnosis. Breast cancer is one disease for which machine learning diagnosis has been tested frequently. It is one of the deadliest and most common cancers among women worldwide, which makes rapid diagnosis essential. This study compares three machine learning algorithms (neural networks, logistic regression, and K-nearest neighbours) to determine which is most effective at diagnosing breast cancer using the Wisconsin Breast Cancer Dataset. Results suggest that neural networks perform best, though only by a small margin over the other two methods.
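
The comparison described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: it assumes scikit-learn is available and uses its bundled copy of the Wisconsin Diagnostic Breast Cancer data. The train/test split, the feature scaling, and every hyperparameter (`n_neighbors=5`, one hidden layer of 16 units, the random seeds) are assumptions chosen for illustration, so the accuracies it prints need not match the figures reported in the study.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# scikit-learn's bundled Wisconsin Diagnostic Breast Cancer data:
# 569 samples, 30 numeric features, binary target (0 = malignant, 1 = benign).
X, y = load_breast_cancer(return_X_y=True)

# Hold out 25% of the samples for testing (split ratio and seed are assumptions).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Illustrative hyperparameters only; the study's actual settings are not given here.
models = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "k-nearest neighbours": KNeighborsClassifier(n_neighbors=5),
    "neural network": MLPClassifier(
        hidden_layer_sizes=(16,), max_iter=2000, random_state=0
    ),
}

accuracies = {}
for name, model in models.items():
    # Standardise features inside a pipeline so the scaler is fit on training data only.
    clf = make_pipeline(StandardScaler(), model)
    clf.fit(X_train, y_train)
    accuracies[name] = clf.score(X_test, y_test)

for name, acc in accuracies.items():
    print(f"{name}: test accuracy = {acc:.3f}")
```

With a single hold-out split, differences of a percentage point or so between models may fall within sampling noise; repeating the comparison with cross-validation (for example `sklearn.model_selection.cross_val_score`) would give a steadier picture of whether one method really leads.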

Author Biography

Zhaonan Sun, Ingenius Prep

Advisor

Published

08-31-2022

How to Cite

Lee, M., & Sun, Z. (2022). Machine Learning Methods for Breast Cancer Diagnosis. Journal of Student Research, 11(3). https://doi.org/10.47611/jsrhs.v11i3.2676

Issue

Vol. 11 No. 3 (2022)

Section

HS Research Articles