Posts

Final Project - Matthew Luu

Average Real Estate Sale Amount Analysis Between 2001 and 2020 in the State of Connecticut

Between 2001 and 2020 the real estate market went through many ups and downs, including several economic downturns. My goal with this project is to examine the average sale prices of real estate in the State of Connecticut and determine whether the overall selling price increased significantly over those years. I obtained the data for this project from Dataset - Real Estate Sales 2001-2020 GL, which is provided by the State of Connecticut and its Office of Policy and Management. The dataset covers all real estate sales with a sales price of $2,000 or greater that occur between October 1 and September 30 of each year. For each sale record, the file includes the town, property address, date of sale, property type (residential, apartment, commercial, industrial or vacant land), sales price...

Module 12 Assignment

a. Construct a time series plot using R

# a. Construct a time series plot
# Load necessary library
library(ggplot2)

# Create a data frame
df <- data.frame(
  Month = factor(c(rep(c("Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"), 2)),
                 levels = c("Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")),
  Year = c(rep(2012, 12), rep(2013, 12)),
  Charge = c(31.9, 27, 31.3, 31, 39.4, 40.7, 42.3, 49.5, 45, 50, 50.9, 58.5,
             39.4, 36.2, 40.5, 44.6, 46.8, 44.7, 52.2, 54, 48.8, 55.8, 58.7, 63.4)
)

# Create a time series plot
ggplot(df, aes(x = interaction(Year, Month, lex.order = TRUE), y = Charge, group = Year, color = as.factor(Year))) +
  geom_line() +
  theme(axis.text.x = element_text(...
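As a complement to the ggplot2 approach in the post, here is a minimal sketch that builds the same series with base R's `ts()`. The charge values are copied from the data frame above; the names `charge_ts` and `yearly_mean` are illustrative and not taken from the assignment.

```r
# Monthly charges for 2012 and 2013, copied from the post's data frame
charge <- c(31.9, 27, 31.3, 31, 39.4, 40.7, 42.3, 49.5, 45, 50, 50.9, 58.5,
            39.4, 36.2, 40.5, 44.6, 46.8, 44.7, 52.2, 54, 48.8, 55.8, 58.7, 63.4)

# Monthly time series starting in January 2012 (12 observations per year)
charge_ts <- ts(charge, start = c(2012, 1), frequency = 12)

# Base R line plot of the whole two-year series
plot(charge_ts, xlab = "Time", ylab = "Charge",
     main = "Monthly charge, 2012-2013")

# Collapse to one value per year to compare the two years at a glance
yearly_mean <- aggregate(charge_ts, nfrequency = 1, FUN = mean)
print(yearly_mean)
```

Storing the data as a `ts` object keeps the calendar information with the values, so functions such as `window()` and `aggregate()` can work on it directly.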

Module 11 Assignment

> # Ensure the ISwR package is installed and loaded
> if (!require(ISwR)) {
+   install.packages("ISwR")
+   library(ISwR)
+ }
>
> # Load the ashina data
> data(ashina)
>
> # Convert the subject to a factor
> ashina$subject <- factor(1:16)
>
> # Create the 'act' and 'plac' data frames
> act <- data.frame(vas = ashina$vas.active, subject = ashina$subject, treat = 1, period = ashina$grp)
> plac <- data.frame(vas = ashina$vas.plac, subject = ashina$subject, treat = 0, period = ashina$grp)
>
> # Define the function 'g1'
> g1 <- function(x, y, z) {
+   seq(from = x, to = y, length.out = z)
+ }
>
> # Define your variables
> a <- g1(2, 2, 8)
> b <- g1(2, 4, 8)
> x <- 1:8
> y <- c(1:4, 8:5)
> z <- rnorm(8)
>
> # Generate the model matrices
> model_ab <- model.matrix(~ a*b)
> model_a_b <- model.matrix(~ a:b)
>
> # Fit the models
> fit_a...
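To see what the two formulas in the transcript actually produce, the following sketch rebuilds `a` and `b` exactly as defined above and inspects the resulting design matrices. It covers only the model-matrix step, not the truncated model fits, and `rank_ab` is an illustrative name.

```r
# Same helper and variables as in the transcript above
g1 <- function(x, y, z) {
  seq(from = x, to = y, length.out = z)
}
a <- g1(2, 2, 8)   # eight copies of 2, i.e. a constant
b <- g1(2, 4, 8)   # eight evenly spaced values from 2 to 4

# ~ a*b expands to main effects plus the interaction
model_ab <- model.matrix(~ a * b)
# ~ a:b keeps only the interaction (plus the intercept)
model_a_b <- model.matrix(~ a:b)

print(colnames(model_ab))
print(colnames(model_a_b))

# Because a is constant, its column is a multiple of the intercept column,
# and a:b is a multiple of b, so the full matrix is rank deficient
rank_ab <- qr(model_ab)$rank
print(rank_ab)
```

A rank lower than the number of columns means `lm()` cannot estimate every coefficient separately and will report `NA` for the aliased terms.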

Module 10 Assignment

Here is the code used to conduct ANOVA and regression analysis on the "cystfibr" and "secher" datasets in RStudio:

# Load the ISwR package, which provides the cystfibr and secher datasets
library(ISwR)

# Load the cystfibr dataset
data("cystfibr")

# Fit a linear model to the data
model_cystfibr <- lm(pemax ~ age + weight + bmp + fev1, data = cystfibr)

# Display the summary of the model
summary(model_cystfibr)

# Conduct ANOVA on the model
anova(model_cystfibr)

# Load the secher dataset
data("secher")

# Log-transform birth weight and abdominal diameter
secher$log_bwt <- log(secher$bwt)
secher$log_ad <- log(secher$ad)

# Fit a linear model to the data
model_secher <- lm(log_bwt ~ log_ad, data = secher)

# Display the summary of the model
summary(model_secher)

In the cystfibr dataset, we’re fitting a linear model where pemax is predicted by age, weight, bmp, and fev1. The coefficients of these variables in the model represent their respective effects on pemax. The intercept is the e...

Module 9 Assignment

This week's assignment is based around two questions. The first is to generate a simple table in R that consists of four columns: Country, age, salary, and Purchased. The following code generates it.

# 1. Simple table
assignment_data <- data.frame(
  Country = c("France", "Spain", "Germany", "Spain", "Germany", "France", "Spain", "France", "Germany", "France"),
  age = c(44, 27, 30, 38, 40, 35, 52, 48, 45, 37),
  # Only five salaries are listed; data.frame() recycles them to fill all ten rows
  salary = c(6000, 5000, 7000, 4000, 8000),
  Purchased = c("No", "Yes", "No", "No", "Yes", "Yes", "No", "Yes", "No", "Yes")
)
print(assignment_data)

The second is to generate a contingency table known as an r x c table u...
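The excerpt is cut off before it shows which variables go into the contingency table, so the following is only a sketch of how an r x c table can be built in R. It assumes, purely for illustration, that Country and Purchased from the table above are the two variables; the object names are hypothetical.

```r
# Columns copied from the simple table in the post
country <- c("France", "Spain", "Germany", "Spain", "Germany",
             "France", "Spain", "France", "Germany", "France")
purchased <- c("No", "Yes", "No", "No", "Yes",
               "Yes", "No", "Yes", "No", "Yes")

# Cross-tabulate: one row per country, one column per purchase outcome
country_by_purchase <- table(Country = country, Purchased = purchased)

# Add row and column totals
with_totals <- addmargins(country_by_purchase)

# Share of each outcome within a country (each row sums to 1)
row_share <- prop.table(country_by_purchase, margin = 1)

print(with_totals)
print(round(row_share, 2))
```

`table()` counts co-occurrences, `addmargins()` appends the totals, and `prop.table()` with `margin = 1` converts counts into within-row proportions.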

Module 8 Assignment

Part 1

Using R, we have been asked to report on the drug and stress levels in the provided data set. To begin, we must create the vectors for each group.

# Create vectors for each group
high_stress <- c(10, 9, 8, 9, 10, 8)
moderate_stress <- c(8, 10, 6, 7, 8, 8)
low_stress <- c(4, 6, 6, 4, 2, 2)

Following that, the vectors must be combined into one data frame, stress_data.

# Combine the vectors into a dataframe
stress_data <- data.frame(
  stress_level = factor(rep(c("High", "Moderate", "Low"), each = 6)),
  score = c(high_stress, moderate_stress, low_stress)
)

An ANOVA test is performed to determine whether there is a significant difference between the groups.

# Perform the ANOVA test
anova_result <- aov(score ~ stress_level, data = stress_data)

# Print the summary of the ANOVA test
summary(anova_result)

This creates the following output:

Part 2

To perform an ANOVA test on the zelazo dataset using R, we must first load the ISwR package and...
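To make the Part 1 ANOVA less of a black box, here is a sketch that computes the F statistic by hand from the three stress vectors and compares it with `aov()`. The intermediate names (`ss_between`, `ss_within`, `f_stat`, and so on) are illustrative and do not come from the assignment.

```r
# Data copied from Part 1 of the post
high_stress <- c(10, 9, 8, 9, 10, 8)
moderate_stress <- c(8, 10, 6, 7, 8, 8)
low_stress <- c(4, 6, 6, 4, 2, 2)

score <- c(high_stress, moderate_stress, low_stress)
stress_level <- factor(rep(c("High", "Moderate", "Low"), each = 6))

# Group means repeated for every observation, and the overall mean
group_mean_per_obs <- ave(score, stress_level)
grand_mean <- mean(score)

# Between-group and within-group sums of squares
ss_between <- sum((group_mean_per_obs - grand_mean)^2)
ss_within <- sum((score - group_mean_per_obs)^2)

# Degrees of freedom: groups - 1, and observations - groups
df_between <- nlevels(stress_level) - 1
df_within <- length(score) - nlevels(stress_level)

# F is the ratio of the two mean squares
f_stat <- (ss_between / df_between) / (ss_within / df_within)
p_value <- pf(f_stat, df_between, df_within, lower.tail = FALSE)

# Cross-check against R's built-in one-way ANOVA
f_from_aov <- summary(aov(score ~ stress_level))[[1]][["F value"]][1]
print(c(by_hand = f_stat, from_aov = f_from_aov, p = p_value))
```

The two F values should agree, since `aov()` performs the same decomposition of the total variation into between-group and within-group parts.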

Module 7 Assignment

1. The assignment begins with the following dataset:

   x <- c(16, 17, 13, 18, 12, 14, 19, 11, 11, 10)
   y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)

   Following that is defining the relationship model and calculating the coefficients, which produces the following output.

2. This goes into Part 2, following the question from Chi Yau. This includes parts 2.1-2.3, along with the output.

3. Part 3 is based around using the multiple regression model and is displayed in the next image.

   3.1: The coefficients tell us about the relationship between each predictor variable and the response variable, holding all other predictors constant.

4. Part 4 follows the question from our textbook, p. 110, Exercise 5.1:

   With the rmr data set, plot metabolic rate versus body weight. Fit a linear regression to the relation.

   According to t...
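The coefficient calculation in step 1 appears only as an image in the original post. As a hedged sketch, assuming the relationship model is a simple linear regression of y on x, the coefficients can be obtained with `lm()` and checked against the closed-form least-squares formulas. The names `relation`, `slope`, and `intercept` are illustrative.

```r
# Dataset from step 1 of the post
x <- c(16, 17, 13, 18, 12, 14, 19, 11, 11, 10)
y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)

# Fit y = intercept + slope * x by ordinary least squares
relation <- lm(y ~ x)
print(coef(relation))

# Closed-form check: slope = Sxy / Sxx, intercept = mean(y) - slope * mean(x)
slope <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
intercept <- mean(y) - slope * mean(x)
print(c(intercept = intercept, slope = slope))
```

Computing the slope and intercept directly is a quick way to confirm that the numbers reported by `lm()` are the ordinary least-squares estimates.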