Mike Conway, 04 Jan 2022
This notebook uses and extends code by Cole Citrenbaum.
This notebook generates a scatterplot of Google Trends results for the terms:
"alcoholic" OR "alcoholics"
First, import relevant Python3 libraries:
from pytrends.request import TrendReq
import datetime as dt
import matplotlib.pyplot as plt
plt.rcParams['figure.figsize'] = [30, 30]
%matplotlib inline
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
import seaborn as sns
Second, query Google Trends through the pytrends library with the relevant terms and load the result into a pandas data frame. The search is restricted to the United States for the period January 2004 to December 2021.
pytrends = TrendReq(hl='en-US', tz=360) # trend request; tz is the timezone offset in minutes (360 = US Central)
kw_list = ["\"alcoholic\" + \"alcoholics\""] # keyword list; "+" acts as OR in Google Trends
pytrends.build_payload(kw_list, cat=0, timeframe='2004-01-01 2021-12-31', geo='US', gprop='') # all categories, US only, web search
trendinfo = pd.DataFrame(pytrends.interest_over_time()) # interest over time, dataframe format
# Uncomment the next line to display all rows
#pd.set_option('display.max_rows',trendinfo.shape[0]+1)
# Show the head (first 5 rows of the dataframe)
trendinfo.head()
date | "alcoholic" + "alcoholics" | isPartial |
---|---|---|
2004-01-01 | 89 | False |
2004-02-01 | 88 | False |
2004-03-01 | 95 | False |
2004-04-01 | 96 | False |
2004-05-01 | 79 | False |
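The `isPartial` column flags periods for which Google Trends had not yet collected complete data. A minimal sketch of filtering on that flag before plotting is given here. It assumes a small stand-in data frame (the hypothetical name `sample`) built from the five rows shown above, so it runs without a network call; it is not part of the original notebook.

```python
import pandas as pd

# Stand-in for `trendinfo`, built from the five rows shown above
# (an assumption for illustration; the real frame comes from pytrends).
column = '"alcoholic" + "alcoholics"'
sample = pd.DataFrame(
    {column: [89, 88, 95, 96, 79], "isPartial": [False] * 5},
    index=pd.to_datetime(
        ["2004-01-01", "2004-02-01", "2004-03-01", "2004-04-01", "2004-05-01"]
    ),
)
sample.index.name = "date"

# Keep only complete periods, then drop the flag column.
is_complete = ~sample["isPartial"].astype(bool)
complete = sample[is_complete].drop(columns="isPartial")
print(complete.shape)
```

The same two lines could be applied to `trendinfo` itself; for a timeframe that ends in the past, as here, every row would normally be complete.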
Third, plot search interest over time as a scatterplot and save the figure:
indices = trendinfo.index # in datetime format
x = np.array(trendinfo['"alcoholic" + "alcoholics"']).reshape((-1, 1)) # search interest values as a column vector
fig, ax = plt.subplots()
ax.scatter(indices, x, color='g', s=3, label='"alcoholic" OR "alcoholics"')
plt.legend(bbox_to_anchor=(.39, 1), loc='upper left', borderaxespad=0.)
plt.ylabel('Search Interest')
plt.xlabel('Year')
plt.show()
fig.savefig("alcoholic.png", dpi=300)
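The notebook imports `LinearRegression` and `r2_score` but does not use them. As a rough sketch of how a linear trend could be estimated from the search-interest values, the example below substitutes NumPy's `np.polyfit` for scikit-learn to keep it dependency-light. It uses only the five readings from the table above as illustrative input, so the resulting slope says nothing about the full 2004 to 2021 series.

```python
import numpy as np

# Illustrative values only: the five monthly readings from the table above.
interest = np.array([89, 88, 95, 96, 79], dtype=float)
months = np.arange(len(interest))  # months elapsed since the first reading

# Degree-1 least-squares fit: interest ~ slope * months + intercept
slope, intercept = np.polyfit(months, interest, 1)
fitted = slope * months + intercept

print(f"slope per month: {slope:.2f}, intercept: {intercept:.2f}")
```

With the full data frame, `interest` would be replaced by the values in `trendinfo`, and `fitted` could be drawn on the same axes with `ax.plot(indices, fitted)`.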