
Predicting House Prices in Hollywood Beach: Data Analysis and Insights

June 13, 2023
Katherine Wilson
🇺🇸 United States
Data Analysis
Katherine Wilson, a proficient data analysis expert with 10+ years' experience, holds a master's from University of Lynchburg. She assists students in completing assignments with expertise and dedication in statistics.
Key Topics
  • Problem Description:
  • Descriptive Statistics:
  • Conclusions:

Problem Description:

A real estate company wanted to identify the factors that predict the selling price of houses in the Hollywood Beach neighborhood. To do so, it collected historical data on a sample of 100 houses that had been on the market in the past six months. The data included attributes such as square footage, number of bedrooms, age, and days on the market.

The analysis was carried out in R, and results were assessed at the 5% significance level. Square footage, number of bedrooms, age, and days on the market were significant predictors, together explaining 85.7% of the variation in house selling prices. The number of bedrooms had the largest positive effect on price, reported as a 68% increase for each additional bedroom. Conversely, the age of the house had a negative effect, reducing the selling price by a reported 6.3% for each additional year.

Descriptive Statistics:

Table 1: Descriptives

Variable          Mean        Max          Min
Selling Price     $641,900    $1,525,000   $189,000
Bedrooms          3.38        5            1
Bathrooms         2.78        4            1
Days on Market    127.80      1,188        2
Age               22.43       36           2
Square Feet       2,329       4,979        520
N = 100; Missing = 4

Variable      Category          Frequency   Percentage
Location      Harbor Islands    54          54%
              West Lake         45          45%
Foreclosed    No                70          70%
              Yes               29          29%

Table 1 summarizes the distribution of each variable in the dataset. On average, the houses in Hollywood Beach sold for $641,900, with prices ranging from $189,000 to $1,525,000. They had an average of 3.4 bedrooms and 2.8 bathrooms, and an average size of 2,329 square feet. The houses were about 22 years old on average and spent an average of 128 days on the market. A majority (54%) were located in Harbor Islands, and most (70%) were not foreclosures.
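
The summaries in Table 1 can be reproduced with a few lines of base R. The following is a minimal sketch, not the code used for the analysis: the `houses` data frame and its values are made up for illustration, the column names are assumptions modeled on the cleaned names used in the R code for this assignment, and `describe()` is a hypothetical helper.

```r
# Toy data standing in for the real listings; all values are made up.
houses <- data.frame(
  selling_price_000 = c(350, 640, 1200, 189, 830),
  bedrooms          = c(2, 3, 5, 1, 4),
  square_feet       = c(1100, 2300, 4200, 520, 3000),
  location          = c("Harbor Islands", "West Lake", "Harbor Islands",
                        "Harbor Islands", "West Lake")
)

# Hypothetical helper: mean, max, and min for every numeric column,
# returned with one row per variable.
describe <- function(df) {
  numeric_cols <- df[sapply(df, is.numeric)]
  data.frame(
    variable  = names(numeric_cols),
    mean      = sapply(numeric_cols, mean, na.rm = TRUE),
    max       = sapply(numeric_cols, max, na.rm = TRUE),
    min       = sapply(numeric_cols, min, na.rm = TRUE),
    row.names = NULL
  )
}

descriptives <- describe(houses)
print(descriptives)

# Frequencies and percentages for a categorical variable.
location_counts <- table(houses$location)
location_pct <- round(100 * prop.table(location_counts))
print(location_counts)
print(location_pct)
```

Passing `na.rm = TRUE` keeps missing values from turning a whole column summary into `NA`, which matters here because the original data contained missing entries.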

Outlier Detection:

Outliers were identified using boxplots, and the flagged values were recoded as missing data. After rows with missing values were removed, 95 observations remained for further analysis.

Figure 1: Boxplot used for outlier detection
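
The boxplot rule described above can be sketched in base R as follows. `boxplot.stats()` returns, in its `out` component, the points lying more than 1.5 times the interquartile range beyond the hinges. The helper name `flag_boxplot_outliers()` and the toy data are assumptions for illustration only, not part of the original analysis.

```r
# Hypothetical helper: replace values flagged by the boxplot rule with NA
# in the named columns, leaving all other columns untouched.
flag_boxplot_outliers <- function(df, columns) {
  for (col in columns) {
    outliers <- boxplot.stats(df[[col]])$out
    df[[col]][df[[col]] %in% outliers] <- NA
  }
  df
}

# Toy data (made up): one listing sat on the market far longer than the rest.
toy <- data.frame(
  days_on_market = c(30, 45, 60, 75, 90, 1188),
  bedrooms       = c(2, 3, 3, 4, 3, 3)
)

flagged <- flag_boxplot_outliers(toy, "days_on_market")

# Drop every row that now contains a missing value.
cleaned <- flagged[complete.cases(flagged), ]
print(cleaned)
```

Recoding outliers as `NA` and then dropping incomplete rows removes the whole observation, so a house that is extreme on one variable no longer contributes to the model at all.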

Model Fitting:

A multiple linear regression model was fitted to predict selling price from the available attributes. Variables were chosen by forward stepwise selection, with a 5% significance level for entry into the model. The selected variables were square feet, bedrooms, age, and days on the market. Together they explained 85.7% of the variation in house selling prices.

Figure 2: Forward selection variables
Figure 3: Linear regression model
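
To make the selection step concrete, here is a rough base R sketch of p-value-based forward selection. It is not the olsrr routine used in the analysis: it swaps in `add1()` with an F test, and `forward_select_p()`, its arguments, and the simulated data are all assumptions for illustration. At each step it adds the candidate with the smallest p-value and stops when no candidate falls below the entry threshold.

```r
# Hypothetical helper: forward selection by p-value using base R's add1().
# Assumes `data` has no missing values (add1() requires a fixed set of rows).
forward_select_p <- function(data, response, candidates, p_enter = 0.05) {
  selected <- character(0)
  current <- lm(reformulate("1", response = response), data = data)
  repeat {
    remaining <- setdiff(candidates, selected)
    if (length(remaining) == 0) break
    # F test for adding each remaining term; the first row is "<none>".
    trial <- add1(current, scope = remaining, test = "F")
    p_values <- trial[["Pr(>F)"]][-1]
    names(p_values) <- rownames(trial)[-1]
    best <- names(p_values)[which.min(p_values)]
    if (p_values[[best]] >= p_enter) break
    selected <- c(selected, best)
    current <- lm(reformulate(selected, response = response), data = data)
  }
  list(selected = selected, model = current)
}

# Simulated data (made up): price depends on size and age, not on `noise`.
set.seed(1)
n <- 100
sim <- data.frame(
  square_feet = runif(n, 500, 5000),
  age         = runif(n, 2, 36),
  noise       = rnorm(n)
)
sim$selling_price_000 <- 100 + 0.2 * sim$square_feet - 5 * sim$age +
  rnorm(n, sd = 30)

result <- forward_select_p(sim, "selling_price_000",
                           c("square_feet", "age", "noise"))
print(result$selected)
print(summary(result$model)$r.squared)
```

Forward selection is greedy: a variable is never removed once it has entered, so the final set can depend on the order in which predictors are added.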

Conclusions:

The analysis concluded that square footage, number of bedrooms, and days on the market had a positive effect on house selling prices in Hollywood Beach, while the age of the house had a negative effect. The number of bedrooms had the largest positive effect, reported as a 68% increase in price for each additional bedroom. In contrast, each additional year of age reduced the average selling price by a reported 6.3%.
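
The direction of each effect and the share of variation explained can be read directly from a fitted model. The sketch below uses simulated data with made-up coefficients, so its numbers do not match the study; the column names are assumptions. Note that in a linear model fitted on price in thousands of dollars, each coefficient is expressed in thousands of dollars per unit of the predictor rather than as a percentage.

```r
# Simulated data (made up) with known positive and negative effects.
set.seed(42)
n <- 95
sim <- data.frame(
  square_feet = runif(n, 500, 5000),
  bedrooms    = sample(1:5, n, replace = TRUE),
  age         = runif(n, 2, 36)
)
sim$selling_price_000 <- 50 + 0.15 * sim$square_feet + 60 * sim$bedrooms -
  4 * sim$age + rnorm(n, sd = 40)

fit <- lm(selling_price_000 ~ square_feet + bedrooms + age, data = sim)
fit_summary <- summary(fit)

# Slope estimates (intercept dropped) and the sign of each effect.
effects <- coef(fit)[-1]
direction <- ifelse(effects > 0, "positive", "negative")
print(data.frame(estimate = round(effects, 3), direction))

# Proportion of variance in price explained by the predictors.
r_squared <- fit_summary$r.squared
cat(sprintf("R-squared: %.3f\n", r_squared))
```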

R-Codes:

# Loading packages
install.packages("janitor")
install.packages("olsrr")
suppressPackageStartupMessages(library(dplyr))
suppressPackageStartupMessages(library(janitor))
suppressPackageStartupMessages(library(tidyverse))
suppressPackageStartupMessages(library(olsrr))

# Data set
data <- read.csv(file.choose())
View(data)

# Data exploration and cleaning
str(data)
summary(data)
clean <- clean_names(data)  # convert column names to snake_case
colnames(clean)
clean$location <- as.factor(clean$location)
clean$foreclosed <- as.factor(clean$foreclosed)
str(clean)
summary(clean)

# Checking for outliers
boxplot(clean, main = "Boxplot Comparison for all variables", xlab = "Variables", col = 4)
clean_x <- clean %>% drop_na()
summary(clean_x)

# Recode values flagged by the boxplot rule as missing
for (x in c("selling_price_000", "days_on_market", "bedrooms", "square_feet")) {
  value <- clean_x[, x][clean_x[, x] %in% boxplot.stats(clean_x[, x])$out]
  clean_x[, x][clean_x[, x] %in% value] <- NA
}

# Drop the rows that contained outliers (start from clean_x, not clean,
# otherwise the recoding above is discarded)
clean_x <- clean_x %>% drop_na()
boxplot(clean_x, main = "Boxplot Comparison for all variables", xlab = "Variables", col = 4)
View(clean_x)

# Regression model
model <- lm(selling_price_000 ~ ., data = clean_x)
# Forward selection at the 5% level; recent olsrr releases may name this argument p_val
ols_step_forward_p(model, penter = 0.05)
Final_Model <- lm(selling_price_000 ~ square_feet + bedrooms + age + days_on_market, data = clean_x)
summary(Final_Model)
