R Programming

SQL Intermediate

1) Introduction This is a set of exercises built around the SQL Intermediate course that I studied at DataCamp (which I really recommend!). The databases used there were different from the one I've used here. To practice while studying, I learned a lot of new SQL concepts and tried to apply them in a different environment (RStudio) with a new dataset. I hope you enjoy it as much as I do!

Continue reading

PostgreSQL and RStudio

1) Introduction
Working with the RStudio IDE (in my opinion, one of the most relevant and easiest to use) and a database connection (in this case, PostgreSQL) is straightforward. The lines below show one way to connect (of course, there are other options). Enjoy it!

2) Packages we are using
DBI
dplyr
odbc

# 2) Important packages ----
library(DBI)
library(dplyr)
library(odbc)

3) Checking out the data sources available
This is an important step: you have to check whether the driver you want is installed on your machine.
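As a rough sketch of the driver check and the connection step, the helpers below wrap two calls from the packages listed in this post: `odbc::odbcListDrivers()` and `DBI::dbConnect()`. The helper names `list_driver_names` and `connect_postgres`, and every connection value in the usage comments (driver name, server, database, user, environment variable), are hypothetical placeholders, not values from the original post.

```r
# Sketch only: all connection values shown here are hypothetical placeholders.

# Return the names of the ODBC drivers registered on this machine.
list_driver_names <- function() {
  unique(odbc::odbcListDrivers()$name)
}

# Open a database connection through DBI + odbc.
# The driver name must match one of the names returned by list_driver_names().
connect_postgres <- function(driver, server, database, uid, pwd, port = 5432) {
  DBI::dbConnect(
    odbc::odbc(),
    Driver   = driver,
    Server   = server,
    Database = database,
    UID      = uid,
    PWD      = pwd,
    Port     = port
  )
}

# Example usage (not run; requires an installed driver and a reachable server):
# list_driver_names()
# con <- connect_postgres("PostgreSQL Unicode", "localhost", "my_database",
#                         "my_user", Sys.getenv("PG_PASSWORD"))
# DBI::dbListTables(con)
# DBI::dbDisconnect(con)
```

Reading the password from an environment variable, as in the commented usage, is just one way to keep credentials out of the script.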

Continue reading

Modeling with tidymodels in R

1) Machine Learning with tidymodels In this chapter, you’ll explore the rich ecosystem of R packages that power tidymodels and learn how they can streamline your machine learning workflows. You’ll then put your tidymodels skills to the test by predicting house sale prices in Seattle, Washington. 1.1) Tidymodels packages tidymodels is a collection of machine learning packages designed to simplify the machine learning workflow in R. In this exercise, you will assign each package within the tidymodels ecosystem to its corresponding process within the machine learning workflow.

Continue reading

Credit Card Fraud Detection

Objective
Our goal is to train a neural network to detect fraudulent credit card transactions in a dataset covering two days of transactions by European cardholders. Source: https://www.kaggle.com/mlg-ulb/creditcardfraud/data

Data
credit = read.csv(path)
The dataset contains transactions made by credit cards in September 2013 by European cardholders. It covers transactions that occurred over two days. As we can see, the dataset consists of thirty explanatory variables and a response variable that indicates whether a transaction was fraudulent.
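A quick way to sanity-check the description above after loading the file might look like the sketch below. It assumes the response column is named `Class` and coded 1 for fraud, as in the Kaggle file; the helper names `n_explanatory` and `fraud_share` are hypothetical and not part of the original post.

```r
# Sketch only: assumes the response column is called "Class" (1 = fraud).

# Count the explanatory variables, i.e. every column except the response.
n_explanatory <- function(credit, response = "Class") {
  stopifnot(response %in% names(credit))
  length(setdiff(names(credit), response))
}

# Share of rows flagged as fraud.
fraud_share <- function(credit, response = "Class") {
  stopifnot(response %in% names(credit))
  mean(credit[[response]] == 1)
}

# Example usage (not run; `credit` comes from read.csv(path) above):
# n_explanatory(credit)
# fraud_share(credit)
```

Checking the fraud share early is useful because a heavily imbalanced response affects how a classifier should be trained and evaluated.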

Continue reading

Data Frame

Data Frame
This format is usually used when the information does not fit in a single dimension (a vector).

Example
product <- c("Product A", "Product B", "Product C", "Product D", "Product E")
price <- c(5, 15, 4, 6, 8)
table_price_product <- data.frame(product, price)
table_price_product
##     product price
## 1 Product A     5
## 2 Product B    15
## 3 Product C     4
## 4 Product D     6
## 5 Product E     8

Indexing
Access Product D in the product table:
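The excerpt stops before showing the indexing code, so the following is only a sketch of a few common base R ways to reach the Product D row, not necessarily the approach the post goes on to use. It rebuilds the same table so that it runs on its own; the variable names `row_by_position`, `row_by_name`, and `price_d` are illustrative.

```r
# Rebuild the table from the example so this block is self-contained.
product <- c("Product A", "Product B", "Product C", "Product D", "Product E")
price <- c(5, 15, 4, 6, 8)
table_price_product <- data.frame(product, price)

# By position: row 4, all columns.
row_by_position <- table_price_product[4, ]

# By condition: rows whose product column equals "Product D".
row_by_name <- table_price_product[table_price_product$product == "Product D", ]

# Only the price of Product D.
price_d <- table_price_product$price[table_price_product$product == "Product D"]
```

Indexing by condition is usually more robust than indexing by position, since it keeps working if the rows are reordered.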

Continue reading