R and RStudio installation on Ubuntu 20.04

sudo apt install r-base
sudo apt install r-cran-rstan r-cran-tidyverse
wget https://download1.rstudio.org/electron/jammy/amd64/rstudio-2023.06.2-561-amd64.deb
sudo apt install -f ./rstudio-2023.06.2-561-amd64.deb
rstudio

Read data from a CSV file in RStudio

df <- read.csv("~/path_to_file_.csv")
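As a minimal sketch of the call above, the snippet below writes a small hypothetical CSV to a temporary path and reads it back. The file contents and the column names `id`, `name`, and `score` are invented for illustration; `na.strings` and `stringsAsFactors` are standard `read.csv` arguments that may be useful when the file has empty cells or text columns.

```r
# Hypothetical sample file, written to a temp path so the snippet is self-contained.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("id,name,score", "1,ann,3.5", "2,bob,NA", "3,,4.0"), csv_path)

# Treat both "NA" and empty cells as missing, and keep text columns as character.
df <- read.csv(csv_path, na.strings = c("NA", ""), stringsAsFactors = FALSE)

print(dim(df))   # number of rows and columns
print(head(df))  # first rows of the data frame
```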

Show the CSV schema and explore the data

columns & schema

tibble::glimpse(df)

schema
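The original leaves this step without a command. One common base R option, offered here as an assumption rather than the author's choice, is `str()`, which prints each column with its type. The sample data frame below is hypothetical and stands in for the CSV loaded earlier.

```r
# Hypothetical sample data standing in for the data frame read from the CSV.
df <- data.frame(
  id    = 1:3,
  name  = c("ann", "bob", "cy"),
  score = c(3.5, NA, 4.0),
  stringsAsFactors = FALSE
)

str(df)  # compact view: one line per column with its type and first values

# Column classes as a named character vector, handy for programmatic checks.
col_types <- sapply(df, class)
print(col_types)
```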

Summary statistics
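No command is given for this step in the original. A reasonable guess is base R's `summary()`, which reports quartiles and the mean for numeric columns along with a count of missing values. The data frame below is a hypothetical stand-in.

```r
# Hypothetical sample data standing in for the data frame read from the CSV.
df <- data.frame(id = 1:3, score = c(3.5, NA, 4.0))

# Per-column min, quartiles, mean, max, and the number of NA values.
print(summary(df))

# Single statistics need na.rm = TRUE when the column contains missing values.
score_mean <- mean(df$score, na.rm = TRUE)
print(score_mean)
```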

Remove all rows with missing values from the data frame
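The command for this step is missing in the original. A minimal sketch, assuming the goal is to drop every row that contains at least one `NA`, is base R's `na.omit()`. The sample data is hypothetical.

```r
# Hypothetical sample data with missing values in two different columns.
df <- data.frame(
  id    = 1:4,
  score = c(3.5, NA, 4.0, 2.0),
  grp   = c("a", "b", NA, "a"),
  stringsAsFactors = FALSE
)

# na.omit() drops every row that has an NA in any column (rows 2 and 3 here).
df_complete <- na.omit(df)

n_dropped <- nrow(df) - nrow(df_complete)
print(df_complete)
print(n_dropped)
```

If only some columns matter, `df[complete.cases(df[, c("score")]), ]` keeps the rows where those columns are present.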

Remove specific columns from the data frame

df2 <- subset(df, select = -c(?column1, ?column2, ..., ?columnN))
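To make the placeholder form above concrete, here is a small sketch with hypothetical column names (`name`, `note`). It also shows an equivalent base R form for when the names to drop are held in a character vector, which `subset()` with a negated `c()` does not handle.

```r
# Hypothetical sample data.
df <- data.frame(
  id    = 1:3,
  name  = c("ann", "bob", "cy"),
  score = c(3.5, 2.0, 4.0),
  note  = c("x", "y", "z"),
  stringsAsFactors = FALSE
)

# Drop columns by unquoted name, as in the template above.
df2 <- subset(df, select = -c(name, note))

# Same result when the column names are stored as strings.
drop_cols <- c("name", "note")
df3 <- df[, !(names(df) %in% drop_cols), drop = FALSE]

print(names(df2))
```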

Filter operations

Filter rows by a column value

library(dplyr)  # filter() here is dplyr::filter(), not stats::filter()
filter(df, ?column == '?value')

Filter rows by a list of column values

filter(df, ?column %in% c('?val1','?val2','?val3'))

Filter rows by checking values in multiple columns

filter(df, ?column1 == '?value1' & ?column2 > ?value2)

Filter rows by a column value and keep only column2 and column3

subset(df, ?column1 == '?value', select = c('column2', 'column3'))
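The filter templates above can be tried on a small data frame. This is only a sketch: the columns `city`, `sales`, and `year` are hypothetical, and it assumes dplyr is installed (for example through the `r-cran-tidyverse` package from the installation step).

```r
library(dplyr)  # assumed to be installed; provides filter()

# Hypothetical sample data.
df <- data.frame(
  city  = c("Oslo", "Rome", "Lima", "Oslo"),
  sales = c(10, 25, 7, 40),
  year  = c(2021, 2021, 2022, 2022),
  stringsAsFactors = FALSE
)

by_value <- filter(df, city == "Oslo")                # match a single value
by_list  <- filter(df, city %in% c("Rome", "Lima"))   # match any value in a list
by_both  <- filter(df, city == "Oslo" & sales > 20)   # conditions on two columns

# Base R: filter rows and keep only selected columns in one call.
picked <- subset(df, city == "Oslo", select = c("sales", "year"))

print(by_both)
print(picked)
```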

Correlation calculation and plot

Install the ggcorrplot package, which is used to plot the correlation matrix

install.packages("ggcorrplot")

Calculate the correlations and store them in cors as a numeric matrix (all columns of df must be numeric)

cors <- cor(df, use = "pairwise.complete.obs")

Store the correlation matrix plot in a ggplot object
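The plotting command is missing in the original. A minimal sketch, assuming the ggcorrplot package is installed, is shown below. It uses the built-in `mtcars` dataset in place of the CSV data, and the variable name `gg` is only a guess at what the author intended. For a data frame with non-numeric columns, selecting the numeric ones first (for example `df[sapply(df, is.numeric)]`) avoids an error from `cor()`.

```r
library(ggcorrplot)  # assumed installed with install.packages("ggcorrplot")

# mtcars is a built-in, all-numeric dataset used here in place of the CSV data.
num_df <- mtcars[, c("mpg", "hp", "wt", "qsec")]

# Pairwise-complete correlations, as in the step above.
cors <- cor(num_df, use = "pairwise.complete.obs")

# ggcorrplot() returns a ggplot object, so it can be stored, printed, or saved later.
gg <- ggcorrplot(cors, type = "lower", lab = TRUE)
print(gg)
```

Because `gg` is a regular ggplot object, it can be written to disk with `ggplot2::ggsave("cors.png", gg)`, where the file name is just an example.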

References:
http://fhollenbach.org/OLD_polisci209_DONOTUSE/img/images/notes-18-correlation-r.pdf
https://cran.r-project.org/web/packages/ggcorrplot/readme/README.html

Connect R to a PostgreSQL database