Statistic

Credit Card Fraud Detection

Objective

Our goal is to train a neural network to detect fraudulent credit card transactions in a dataset covering two days of transactions by European cardholders.

Source: https://www.kaggle.com/mlg-ulb/creditcardfraud/data

Data

credit = read.csv(path)

The dataset contains transactions made by credit cards in September 2013 by European cardholders, all of which occurred within two days. As we can see, the dataset consists of thirty explanatory variables and a response variable that indicates whether a transaction was fraudulent or not.
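As a minimal sketch of a first check one might run after loading the data, the snippet below counts legitimate and fraudulent transactions. It assumes the response column is named `Class` and coded 0/1, which the excerpt does not state. The helper `class_balance` and the toy data frame are hypothetical stand-ins for the real `credit` data.

```r
# Hypothetical helper: counts and proportions of each value of a response column.
class_balance <- function(df, response = "Class") {
  counts <- table(df[[response]])
  data.frame(
    class = names(counts),
    n = as.integer(counts),
    proportion = as.numeric(counts) / sum(counts)
  )
}

# Toy stand-in for the real data (made-up values, for illustration only).
toy_credit <- data.frame(
  Amount = c(12.5, 300.0, 7.2, 45.0, 1.0),
  Class  = c(0, 0, 1, 0, 0)
)

balance <- class_balance(toy_credit)
print(balance)
```

With the real data, the same call would be `class_balance(credit)`. A strongly unbalanced result would suggest that plain accuracy is a poor metric for the network.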

Continue reading

Correlation and Regression

path <- "C:/Users/andre/OneDrive/Área de Trabalho/salerno/blogdown/datasets/ncbirths"
path <- paste0(path, "/ncbirths.csv")
data <- read.csv(path, stringsAsFactors = FALSE)
dim(data)
## [1] 1450 15
names(data)
## [1] "ID" "Plural" "Sex" "MomAge"
## [5] "Weeks" "Marital" "RaceMom" "HispMom"
## [9] "Gained" "Smoke" "BirthWeightOz" "BirthWeightGm"
## [13] "Low" "Premie" "MomRace"
library(ggplot2)
ggplot(data = data, aes(y = BirthWeightOz, x = Weeks)) + geom_point()
## Warning: Removed 1 rows containing missing values (geom_point).
# Boxplot of weight vs.
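The scatterplot of `BirthWeightOz` against `Weeks` can be followed by a numerical summary. The sketch below is one way to do that, not necessarily the approach the post takes: it computes the Pearson correlation and fits a simple linear regression. The toy data frame is a made-up stand-in for the `ncbirths` data. It includes one missing value, mirroring the warning above, so the missing-data handling is visible.

```r
# Toy stand-in for the birth data (made-up values, for illustration only).
toy_births <- data.frame(
  Weeks         = c(36, 38, 39, 40, NA, 41),
  BirthWeightOz = c(95, 110, 118, 124, 120, 130)
)

# Pearson correlation, dropping rows with a missing value in either column.
r <- cor(toy_births$Weeks, toy_births$BirthWeightOz, use = "complete.obs")

# Simple linear regression of birth weight on gestation length.
# lm() drops incomplete rows by default (na.action = na.omit).
fit <- lm(BirthWeightOz ~ Weeks, data = toy_births)

print(r)
print(coef(fit))
```

In a simple regression with one predictor, the squared correlation equals the model's R-squared, so `r^2` and `summary(fit)$r.squared` should agree.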

Continue reading

Diagnosing breast cancer with the kNN algorithm

1 - Introduction

Could machine learning algorithms detect an abnormal cell process in advance? We know that this clinical battle is not easy and that many people are involved in it, trying to identify a clear path to a cure. As a complement to the human decision process, could technology reduce the subjective bias inherent in that process and improve our decisions? We know that human processing capacity is limited compared with the high capacity of computers.

Continue reading