New York City Airbnb Market Analysis and Price Prediction

Analyzing the dynamic NYC Airbnb market: predicting prices, exploring geographical trends, and understanding key factors. We used predictive modeling and interactive dashboards for comprehensive insights. Team project by @Nihith Nath Kandikattu, @Keerthana Allam, @Hithesh Kumar Duttuluri, and Charan Sai Pandaraboyina.
In this project, we used the Airbnb NYC dataset from Kaggle. The main objectives were to handle the data precisely and adhere to the project guidelines. We used Python to parse the raw data, organized it into a well-structured database, and used SQL to merge the tables into Pandas for analysis. Once these steps were completed, we were free to choose our next path: we explored advanced data analysis with interactive elements and experimented with machine learning for price prediction. Our project's success relied on executing each step meticulously, sharing our findings, and presenting a concise report.
In the ever-changing landscape of the New York City Airbnb market, our project aims to analyze data and predict prices, offering valuable insights for potential investors and discerning customers. Our main goal is to discover patterns that reveal areas with the highest number of listings, understand the factors influencing different costs, and grasp the preferences of both hosts and guests. By exploring the complex interactions between neighborhood characteristics, seasonal demand, and pricing dynamics, our research aims to equip new investors with decision-making tools and provide customers with a strategic advantage in selecting listings based on their preferences and budget constraints. This project provides a comprehensive understanding for hosts and guests, offering a valuable resource for strategic decision-making in the dynamic and popular Airbnb market.
| Column Name | Description |
|---|---|
| listing_name | The name of the Airbnb listing. (String) |
| host_name | The name of the host of the Airbnb listing. (String) |
| neighbourhood_grp | The neighbourhood group the Airbnb listing is located in. (String) |
| latitude | The latitude coordinate of the Airbnb listing. (Float) |
| longitude | The longitude coordinate of the Airbnb listing. (Float) |
| room_type | The type of room offered by the Airbnb listing. (String) |
| price | The price per night of the Airbnb listing. (Integer) |
| minimum_nights | The minimum number of nights required for booking the Airbnb listing. (Integer) |
| number_of_reviews | The total number of reviews the Airbnb listing has received. (Integer) |
| last_review | The date of the last review the Airbnb listing has received. (Date) |
| reviews_per_month | The average number of reviews the Airbnb listing receives per month. (Float) |
| calculated_host_listings_count | The total number of listings the host has. (Integer) |
| availability_365 | The number of days the Airbnb listing is available for booking in a year. (Integer) |
| NeighborhoodID | Neighborhood information for each listing (referencing NeighborhoodID from the Neighborhood table) |

We normalized the raw data into three tables (Host, Neighborhood, and Listings) to remove transitive dependencies, which improves data integrity and prevents insertion, update, and deletion anomalies.

| Column Name | Description |
|---|---|
| HostID (PK) | Unique ID for each host |
| HostName | Host's name |
| NumberOfListings | Number of listings under that host |

| Column Name | Description |
|---|---|
| NeighborhoodID (PK) | Unique ID for each neighborhood |
| NeighborhoodGroup | Area in which the neighborhood is located |
| Neighborhood | Neighborhood's name |

| Column Name | Description |
|---|---|
| ListingID (PK) | Unique ID for each listing |
| ListingName | Name of each listing |
| HostID (FK) | Host information for each listing (referencing HostID from the Host table) |
| NeighborhoodID (FK) | Neighborhood information for each listing (referencing NeighborhoodID from the Neighborhood table) |
| Latitude | Latitude of the listing |
| Longitude | Longitude of the listing |
| ListingType | Type of the listing (Entire Home/Apartment or Single Room) |
| Price | Price per night of the listing |
| MinimumNights | The minimum number of nights required for booking the listing |
| NumberOfReviews | The total number of reviews the listing has received |
| MonthlyReviewRate | The average number of reviews the listing receives per month |
| Availability_365 | The number of days the listing is available for booking in a year |
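
The normalization described above can be sketched with Python's built-in `sqlite3` module. This is a minimal illustration, not our actual loading script: the sample rows are made up, only a subset of the Listings columns is included, and hosts are identified by name purely to keep the toy example short (an assumption, since real host names are not unique).

```python
import sqlite3

# Made-up sample rows: (listing_name, host_name, neighbourhood_group,
# neighbourhood, room_type, price). Not taken from the real dataset.
raw_rows = [
    ("Cozy loft", "Ana", "Brooklyn", "Williamsburg", "Entire home/apt", 150),
    ("Sunny room", "Ana", "Brooklyn", "Bushwick", "Private room", 60),
    ("Midtown studio", "Ben", "Manhattan", "Midtown", "Entire home/apt", 200),
]

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Host (
        HostID INTEGER PRIMARY KEY,
        HostName TEXT UNIQUE,           -- assumption: name identifies a host
        NumberOfListings INTEGER
    );
    CREATE TABLE Neighborhood (
        NeighborhoodID INTEGER PRIMARY KEY,
        NeighborhoodGroup TEXT,
        Neighborhood TEXT,
        UNIQUE (NeighborhoodGroup, Neighborhood)
    );
    CREATE TABLE Listings (
        ListingID INTEGER PRIMARY KEY,
        ListingName TEXT,
        HostID INTEGER REFERENCES Host(HostID),
        NeighborhoodID INTEGER REFERENCES Neighborhood(NeighborhoodID),
        ListingType TEXT,
        Price INTEGER
    );
""")

for name, host, group, hood, room_type, price in raw_rows:
    # Each host and neighborhood is stored once; repeats are ignored.
    conn.execute("INSERT OR IGNORE INTO Host (HostName) VALUES (?)", (host,))
    conn.execute(
        "INSERT OR IGNORE INTO Neighborhood (NeighborhoodGroup, Neighborhood) "
        "VALUES (?, ?)",
        (group, hood),
    )
    host_id = conn.execute(
        "SELECT HostID FROM Host WHERE HostName = ?", (host,)
    ).fetchone()[0]
    hood_id = conn.execute(
        "SELECT NeighborhoodID FROM Neighborhood "
        "WHERE NeighborhoodGroup = ? AND Neighborhood = ?",
        (group, hood),
    ).fetchone()[0]
    conn.execute(
        "INSERT INTO Listings "
        "(ListingName, HostID, NeighborhoodID, ListingType, Price) "
        "VALUES (?, ?, ?, ?, ?)",
        (name, host_id, hood_id, room_type, price),
    )

# Fill in the per-host listing count from the Listings table.
conn.execute("""
    UPDATE Host SET NumberOfListings =
        (SELECT COUNT(*) FROM Listings WHERE Listings.HostID = Host.HostID)
""")

# Merge the three tables back into one flat result for analysis.
joined = conn.execute("""
    SELECT l.ListingName, h.HostName, h.NumberOfListings,
           n.NeighborhoodGroup, n.Neighborhood, l.ListingType, l.Price
    FROM Listings AS l
    JOIN Host AS h ON h.HostID = l.HostID
    JOIN Neighborhood AS n ON n.NeighborhoodID = l.NeighborhoodID
    ORDER BY l.ListingID
""").fetchall()
```

The same `SELECT ... JOIN` statement can be passed to `pandas.read_sql_query(sql, conn)` to load the merged result directly into a DataFrame.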

During the Exploratory Data Analysis (EDA) phase, we visualized various aspects of the Airbnb dataset to gain insights into the distribution of listings, availability, and location. The following graphs provide a comprehensive analysis of the data:






The map illustrates the concentration and pricing variations of listings across different neighborhoods in New York City. The blue dots highlight the locations of key attractions, offering a spatial overview within the diverse NYC landscape.
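
One way to quantify how close a listing is to an attraction is the haversine (great-circle) distance between their coordinates. The sketch below is an illustration only: this README does not state how proximity was measured, the attraction coordinates are approximate, and the helper names `haversine_km` and `nearest_attraction` are hypothetical.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two (lat, lon) points."""
    radius_km = 6371.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


# Approximate coordinates, for illustration only.
attractions = {
    "Times Square": (40.7580, -73.9855),
    "Empire State Building": (40.7484, -73.9857),
}


def nearest_attraction(lat, lon):
    """Return (name, distance_km) of the attraction closest to a listing."""
    return min(
        ((name, haversine_km(lat, lon, a_lat, a_lon))
         for name, (a_lat, a_lon) in attractions.items()),
        key=lambda pair: pair[1],
    )
```

Applied to the `latitude` and `longitude` columns of each listing, the resulting distance could serve as an extra feature or as a filter for listings near an attraction.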

To mitigate the right skew in the original distribution of Airbnb listing prices, we applied a logarithmic transformation. The transformation produces a more symmetrical distribution, which makes the data better suited to the subsequent analysis and modeling. We then fitted Linear Regression and Ridge Regression models to forecast listing prices and evaluated them with Mean Absolute Error, Mean Squared Error, and R-squared. We also plotted actual against predicted prices to assess how well the models capture variations in listing prices.
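
The workflow above can be sketched without any dependencies. Everything here is an assumption made for illustration: the data is synthetic, the model uses a single hypothetical feature `x`, `log1p` stands in for the logarithmic transformation (the exact variant used in the notebook is not stated here), and the ridge fit is the one-feature closed form with an unpenalized intercept rather than the notebook's actual model code.

```python
import math
import random
import statistics


def skewness(values):
    """Population skewness: the mean of cubed z-scores."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


def fit_ridge_1d(x, y, alpha=0.0):
    """Fit y ~ intercept + slope * x with an L2 penalty on the slope only.

    alpha=0 reduces to ordinary least squares (plain linear regression).
    """
    x_mean, y_mean = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    slope = sxy / (sxx + alpha)
    return y_mean - slope * x_mean, slope


def evaluate(y_true, y_pred):
    """Return MAE, MSE and R-squared for a set of predictions."""
    n = len(y_true)
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n
    mean_true = statistics.fmean(y_true)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return {"mae": mae, "mse": mse, "r2": 1 - (mse * n) / ss_tot}


# Synthetic, right-skewed prices driven by one made-up feature.
rng = random.Random(0)  # fixed seed so the sketch is reproducible
x = [rng.random() for _ in range(500)]
prices = [math.exp(4 + 1.5 * xi + rng.gauss(0, 0.5)) for xi in x]
log_prices = [math.log1p(p) for p in prices]  # log(1 + price)

# Simple holdout split: first 400 rows for training, last 100 for testing.
x_train, x_test = x[:400], x[400:]
y_train, y_test = log_prices[:400], log_prices[400:]

results = {}
for name, alpha in [("linear", 0.0), ("ridge", 1.0)]:
    intercept, slope = fit_ridge_1d(x_train, y_train, alpha)
    preds = [intercept + slope * xi for xi in x_test]
    results[name] = {"slope": slope, **evaluate(y_test, preds)}
```

The metrics here are computed on the log scale; applying `math.expm1` to the predictions maps them back to prices before comparing against actual values. With real data and many features, scikit-learn's `LinearRegression` and `Ridge` together with `mean_absolute_error`, `mean_squared_error`, and `r2_score` cover the same steps.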

In conclusion, our project conducted an in-depth analysis of the dynamic New York City Airbnb market, revealing significant insights. We meticulously explored room availability, categorized neighborhoods, and pinpointed locations near key tourist attractions, leading to the following key findings and insights.
Here is the notebook output in HTML: AirBNB Listing Analysis (please download the file).
These insights provide a comprehensive understanding of the New York City Airbnb market, enabling better decision-making for both hosts and guests in the dynamic and competitive environment.