Time-series data using Python
To do stock analysis, we need to be familiar with the formats in which time-series data is received. We will create time-series data using Python's popular packages, NumPy and pandas.
First, we need to import a few popular packages: pandas, NumPy, and the standard datetime package.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from datetime import datetime
To create date and time values, we will use the datetime class, which is part of Python's standard library.
# Input data
c_year = 2021
c_month = 11
c_day = 10
c_hour = 1
c_min = 20
c_sec = 15
c_date_obj = datetime(c_year, c_month, c_day, c_hour, c_min, c_sec)
print(c_date_obj)
# 2021-11-10 01:20:15
The year, month, and day arguments are required. If we omit the time arguments (hour, minute, second), the datetime class sets them to zero and still creates a datetime object. To access the individual components of a datetime object, we can do as follows:
# How to access individual parameters from the datetime object.
print(c_date_obj.year)   # 2021
print(c_date_obj.month)  # 11
print(c_date_obj.day)    # 10
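As a small sketch of the defaulting behaviour described above (the names date_only and raised are illustrative and not part of the original code), the following shows that the time fields fall back to zero and that leaving out a required field is rejected:

```python
from datetime import datetime

# Only year, month, and day are supplied; the time fields default to zero.
date_only = datetime(2021, 11, 10)
print(date_only)  # 2021-11-10 00:00:00
print(date_only.hour, date_only.minute, date_only.second)  # 0 0 0

# Leaving out a required field (here, the day) raises a TypeError.
raised = False
try:
    datetime(2021, 11)
except TypeError:
    raised = True
print(raised)  # True
```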
Next, we create a list of datetime objects, which we will then supply to pandas's DatetimeIndex class.
# Example to create datetime list
datetime_lst = [
    datetime(2021, 11, 1),
    datetime(2021, 11, 2),
    datetime(2021, 11, 3),
    datetime(2021, 11, 4),
]
print(datetime_lst)
# [datetime.datetime(2021, 11, 1, 0, 0), datetime.datetime(2021, 11, 2, 0, 0), datetime.datetime(2021, 11, 3, 0, 0), datetime.datetime(2021, 11, 4, 0, 0)]
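For longer series, typing each date by hand becomes tedious. One possible alternative, sketched here with the standard library's timedelta class (the names start and generated_lst are illustrative), is to build the same list from a start date and a day offset:

```python
from datetime import datetime, timedelta

# Start date of the series (same first date as in the example above).
start = datetime(2021, 11, 1)

# Add 0, 1, 2, 3 days to the start date to get four consecutive days.
generated_lst = [start + timedelta(days=offset) for offset in range(4)]

print(generated_lst[0])   # 2021-11-01 00:00:00
print(generated_lst[-1])  # 2021-11-04 00:00:00
```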
Pandas's DatetimeIndex class takes the list of datetime objects and returns an immutable index of datetime64 values, which can then serve as the row index of a dataframe.
# Pandas datetime object
datetime_pd_obj = pd.DatetimeIndex(datetime_lst)
print(datetime_pd_obj)
# DatetimeIndex(['2021-11-01', '2021-11-02', '2021-11-03', '2021-11-04'], dtype='datetime64[ns]', freq=None)
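If the dates are evenly spaced, pandas can also generate the index directly. The sketch below uses pd.date_range, which is a different route from the one taken in this article; the name range_idx is illustrative. It yields the same four daily timestamps without an intermediate Python list:

```python
import pandas as pd

# Four consecutive calendar days starting on 2021-11-01.
range_idx = pd.date_range(start="2021-11-01", periods=4, freq="D")

print(len(range_idx))  # 4
print(range_idx[0])    # 2021-11-01 00:00:00
print(range_idx[-1])   # 2021-11-04 00:00:00
```

Unlike the index built from a list, this one carries a daily frequency (freq='D'), since the spacing is known up front.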
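Stock data usually consists of a date-time and a price. In this case, we will randomly generate the price of a stock using NumPy's np.random.randn function, which draws samples from the standard normal distribution. The values are therefore only placeholders and, unlike real prices, can be negative.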
# To get associated data, we will randomly generate using numpy random function.
s_data = np.random.randn(4, 1)
print(s_data)
# [[ 1.16341894]
#  [ 0.89951112]
#  [-0.32026986]
#  [ 0.46115012]]
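Because np.random.randn is unseeded, every run prints different numbers. As a hedged sketch of one way to make the data repeatable and strictly positive, the block below uses NumPy's Generator API with a fixed seed and turns small random log-returns into prices. The seed 42, the 1% scale, and the starting price of 100 are arbitrary assumptions chosen for illustration, not values from the original article:

```python
import numpy as np

# A seeded generator gives the same numbers on every run (seed is arbitrary).
rng = np.random.default_rng(seed=42)

# Hypothetical daily log-returns with roughly 1% standard deviation.
returns = rng.standard_normal((4, 1)) * 0.01

# Compound the returns from a hypothetical starting price of 100.
# The exponential keeps every simulated price positive.
prices = 100.0 * np.exp(np.cumsum(returns, axis=0))

print(prices.shape)  # (4, 1)
```

The resulting array has the same shape as s_data, so it could be passed to the dataframe constructor in the same way.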
Next, we will combine the random stock price data and the date-time index using pandas's DataFrame class. The output is a dataframe object indexed by date.
s_hdr = ['stock_price']
s_df_obj = pd.DataFrame(s_data, index=datetime_pd_obj, columns=s_hdr)
print(s_df_obj)
#             stock_price
# 2021-11-01     1.163419
# 2021-11-02     0.899511
# 2021-11-03    -0.320270
# 2021-11-04     0.461150
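One benefit of using a DatetimeIndex is that rows can be selected by date. The following is a minimal sketch with fixed, made-up prices (demo_df and its values are illustrative, so that the printed results are deterministic); it looks up a single day and then a date range with .loc:

```python
import pandas as pd

# Small dataframe with made-up prices, indexed by date.
idx = pd.DatetimeIndex(["2021-11-01", "2021-11-02", "2021-11-03", "2021-11-04"])
demo_df = pd.DataFrame({"stock_price": [101.0, 102.5, 99.8, 100.4]}, index=idx)

# Look up the price on a single day by its timestamp label.
one_day = demo_df.loc[pd.Timestamp("2021-11-02"), "stock_price"]
print(one_day)  # 102.5

# Slice a date range; label-based slices include both endpoints.
window = demo_df.loc["2021-11-02":"2021-11-03"]
print(len(window))  # 2
```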
We can run queries on the dataframe object. For example, we can find the positions of the latest and earliest index entries with argmax and argmin, or the index values themselves with max and min.
# Query on the dataframe: positions of the latest and earliest dates
print(s_df_obj.index.argmax())  # 3
print(s_df_obj.index.argmin())  # 0
# To find the index values
print(s_df_obj.index.max())  # 2021-11-04 00:00:00
print(s_df_obj.index.min())  # 2021-11-01 00:00:00
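The queries above operate on the index. A related question in stock analysis is on which date the price peaked or bottomed out. The sketch below answers it with the Series methods idxmax and idxmin, again on made-up prices (demo_df is illustrative and not part of the original code):

```python
import pandas as pd

# Small dataframe with made-up prices, indexed by date.
idx = pd.DatetimeIndex(["2021-11-01", "2021-11-02", "2021-11-03", "2021-11-04"])
demo_df = pd.DataFrame({"stock_price": [101.0, 102.5, 99.8, 100.4]}, index=idx)

# idxmax / idxmin return the index label (the date) of the extreme value.
high_date = demo_df["stock_price"].idxmax()
low_date = demo_df["stock_price"].idxmin()

print(high_date)  # 2021-11-02 00:00:00
print(low_date)   # 2021-11-03 00:00:00
print(demo_df["stock_price"].max())  # 102.5
```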
In conclusion, we have seen how to create time-series data using the pandas, NumPy, and datetime packages in Python. The entire code required to create the time-series data manually is as follows.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from datetime import datetime

# Playing with python datetime class.
# Using datetime class, we can create datetime object by supplying appropriate arguments.

# Input data
c_year = 2021
c_month = 11
c_day = 10
c_hour = 1
c_min = 20
c_sec = 15
c_date_obj = datetime(c_year, c_month, c_day, c_hour, c_min, c_sec)
print(c_date_obj)
# 2021-11-10 01:20:15

# How to access individual parameters from the datetime object.
print(c_date_obj.year)   # 2021
print(c_date_obj.month)  # 11
print(c_date_obj.day)    # 10

# Example to create datetime list
datetime_lst = [
    datetime(2021, 11, 1),
    datetime(2021, 11, 2),
    datetime(2021, 11, 3),
    datetime(2021, 11, 4),
]
print(datetime_lst)
# [datetime.datetime(2021, 11, 1, 0, 0), datetime.datetime(2021, 11, 2, 0, 0), datetime.datetime(2021, 11, 3, 0, 0), datetime.datetime(2021, 11, 4, 0, 0)]

# Pandas datetime object
datetime_pd_obj = pd.DatetimeIndex(datetime_lst)
print(datetime_pd_obj)
# DatetimeIndex(['2021-11-01', '2021-11-02', '2021-11-03', '2021-11-04'], dtype='datetime64[ns]', freq=None)

# To get associated data, we will randomly generate using numpy random function.
s_data = np.random.randn(4, 1)
print(s_data)
# [[ 1.16341894]
#  [ 0.89951112]
#  [-0.32026986]
#  [ 0.46115012]]

s_hdr = ['stock_price']
s_df_obj = pd.DataFrame(s_data, index=datetime_pd_obj, columns=s_hdr)
print(s_df_obj)
#             stock_price
# 2021-11-01     1.163419
# 2021-11-02     0.899511
# 2021-11-03    -0.320270
# 2021-11-04     0.461150

# Query on the dataframe: positions of the latest and earliest dates
print(s_df_obj.index.argmax())  # 3
print(s_df_obj.index.argmin())  # 0
# To find the index values
print(s_df_obj.index.max())  # 2021-11-04 00:00:00
print(s_df_obj.index.min())  # 2021-11-01 00:00:00
CITE THIS AS:
“How to create time-series data in Python” From NotePub.io – Publish & Share Note! https://notepub.io/questions/how-to-create-time-series-data-in-python/