Goal: Model how booking demand responds to changes in price, and estimate optimal pricing zones.
Code
# 1️⃣ Load Dataimport pandas as pddf = pd.read_csv('../data/raw/hotel_bookings.csv')# Filter to completed bookings only (exclude cancellations)df = df[df['is_canceled'] ==0]df.head()
hotel
is_canceled
lead_time
arrival_date_year
arrival_date_month
arrival_date_week_number
arrival_date_day_of_month
stays_in_weekend_nights
stays_in_week_nights
adults
...
deposit_type
agent
company
days_in_waiting_list
customer_type
adr
required_car_parking_spaces
total_of_special_requests
reservation_status
reservation_status_date
0
Resort Hotel
0
342
2015
July
27
1
0
0
2
...
No Deposit
NaN
NaN
0
Transient
0.0
0
0
Check-Out
2015-07-01
1
Resort Hotel
0
737
2015
July
27
1
0
0
2
...
No Deposit
NaN
NaN
0
Transient
0.0
0
0
Check-Out
2015-07-01
2
Resort Hotel
0
7
2015
July
27
1
0
1
1
...
No Deposit
NaN
NaN
0
Transient
75.0
0
0
Check-Out
2015-07-02
3
Resort Hotel
0
13
2015
July
27
1
0
1
1
...
No Deposit
304.0
NaN
0
Transient
75.0
0
0
Check-Out
2015-07-02
4
Resort Hotel
0
14
2015
July
27
1
0
2
2
...
No Deposit
240.0
NaN
0
Transient
98.0
0
1
Check-Out
2015-07-03
5 rows × 32 columns
Code
# 2️⃣ Aggregate Price & Demand# Group by price buckets → example simple binningdf['price_bucket'] = pd.cut(df['adr'], bins=[0, 50, 100, 150, 200, 300, 500, 1000])# Count bookings per price bucketdemand_by_price = df.groupby('price_bucket').size().reset_index(name='bookings')demand_by_price
C:\Users\atima\AppData\Local\Temp\ipykernel_4156\1096299440.py:7: FutureWarning: The default of observed=False is deprecated and will be changed to True in a future version of pandas. Pass observed=False to retain current behavior or observed=True to adopt the future default and silence this warning.
demand_by_price = df.groupby('price_bucket').size().reset_index(name='bookings')
price_bucket
bookings
0
(0, 50]
7706
1
(50, 100]
33972
2
(100, 150]
21520
3
(150, 200]
7184
4
(200, 300]
2848
5
(300, 500]
187
6
(500, 1000]
2
Code
# 3️⃣ Plot Demand vs Priceimport matplotlib.pyplot as plt# Midpoint of each bucket → simple estimatedemand_by_price['price_mid'] = demand_by_price['price_bucket'].apply(lambda x: x.mid)plt.figure(figsize=(8,6))plt.plot(demand_by_price['price_mid'], demand_by_price['bookings'], marker='o')plt.title('Bookings vs Price (Elasticity Curve)')plt.xlabel('Price (Midpoint of Bucket)')plt.ylabel('Number of Bookings')plt.grid()plt.show()
Code
# 4️⃣ Log-Log Regression (Elasticity Estimate)import numpy as npimport statsmodels.api as sm# Prepare log-log datademand_by_price['price_mid'] = demand_by_price['price_mid'].astype(float)X = np.log(demand_by_price['price_mid'])y = np.log(demand_by_price['bookings'])X = sm.add_constant(X) # add interceptmodel = sm.OLS(y, X).fit()print(model.summary())# Elasticity ≈ slope coefficient → % change in demand per % change in price
OLS Regression Results
==============================================================================
Dep. Variable: bookings R-squared: 0.598
Model: OLS Adj. R-squared: 0.518
Method: Least Squares F-statistic: 7.441
Date: Mon, 09 Jun 2025 Prob (F-statistic): 0.0414
Time: 01:51:40 Log-Likelihood: -14.821
No. Observations: 7 AIC: 33.64
Df Residuals: 5 BIC: 33.53
Df Model: 1
Covariance Type: nonrobust
==============================================================================
coef std err t P>|t| [0.025 0.975]
------------------------------------------------------------------------------
const 19.5084 4.513 4.323 0.008 7.909 31.108
price_mid -2.3678 0.868 -2.728 0.041 -4.599 -0.137
==============================================================================
Omnibus: nan Durbin-Watson: 1.033
Prob(Omnibus): nan Jarque-Bera (JB): 1.053
Skew: -0.709 Prob(JB): 0.591
Kurtosis: 1.734 Cond. No. 27.0
==============================================================================
Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
c:\GitHub\DynamicHotelPricingOptimization\venv\Lib\site-packages\statsmodels\stats\stattools.py:74: ValueWarning: omni_normtest is not valid with less than 8 observations; 7 samples were given.
warn("omni_normtest is not valid with less than 8 observations; %i "