How to create time-series data in Python?

Time-series data using Python

To do stock analysis, we need to be familiar with the formats in which time-series data arrive. We will create time-series data using two popular Python packages, NumPy and pandas.

First, we import pandas, NumPy, and the standard datetime module. Matplotlib is imported as well, although the code in this note does not use it.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from datetime import datetime

To create date and time values, we will use the datetime class from the datetime module, which is part of the Python standard library.

# Input data
c_year = 2021
c_month = 11
c_day = 10
c_hour = 1
c_min = 20
c_sec = 15

c_date_obj = datetime(c_year,c_month,c_day,c_hour,c_min,c_sec)
print(c_date_obj)
# 2021-11-10 01:20:15

The year, month, and day arguments are required. If we omit the time arguments (hour, minute, second), the datetime class sets them to zero. To access the individual fields of a datetime object, we can do as follows:

# How to access individual parameters from the datetime object. 
print(c_date_obj.year)
# 2021

print(c_date_obj.month)
# 11

print(c_date_obj.day)
# 10
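
As a quick check of the defaults described above, the following minimal sketch creates a datetime object from a date only and then tries to omit the date fields. The variable names `d` and `missing_date_ok` are illustrative and not part of the original example.

```python
from datetime import datetime

# Only the date is supplied; the time fields fall back to zero.
d = datetime(2021, 11, 10)
print(d)
# 2021-11-10 00:00:00
print(d.hour, d.minute, d.second)
# 0 0 0

# The date fields are mandatory: leaving out month and day raises TypeError.
try:
    datetime(2021)
    missing_date_ok = True
except TypeError:
    missing_date_ok = False
print(missing_date_ok)
# False
```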

Next, we create a list of datetime objects, which we will pass to the pandas DatetimeIndex class.

# Example to create datetime list
datetime_lst = [datetime(2021,11,1),datetime(2021,11,2),datetime(2021,11,3),datetime(2021,11,4)]
print(datetime_lst)
# [datetime.datetime(2021, 11, 1, 0, 0), datetime.datetime(2021, 11, 2, 0, 0), datetime.datetime(2021, 11, 3, 0, 0), datetime.datetime(2021, 11, 4, 0, 0)]
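
Typing each date by hand does not scale beyond a few entries. One possible alternative, sketched here under the assumption that the dates are consecutive days, builds the same list with timedelta and a list comprehension. The names `start` and `generated_lst` are illustrative.

```python
from datetime import datetime, timedelta

# First date of the series
start = datetime(2021, 11, 1)

# Add 0, 1, 2, and 3 days to the start date
generated_lst = [start + timedelta(days=i) for i in range(4)]

# The generated list matches the hand-written one
manual_lst = [datetime(2021, 11, 1), datetime(2021, 11, 2),
              datetime(2021, 11, 3), datetime(2021, 11, 4)]
print(generated_lst == manual_lst)
# True
```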

The pandas DatetimeIndex class converts the list into an index of datetime64 values, which can later serve as the row index of a dataframe.

# Pandas datetime object
datetime_pd_obj = pd.DatetimeIndex(datetime_lst)
print(datetime_pd_obj)
# DatetimeIndex(['2021-11-01', '2021-11-02', '2021-11-03', '2021-11-04'], dtype='datetime64[ns]', freq=None)
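
For regularly spaced dates, pandas can also generate the index directly with pd.date_range, without an intermediate Python list. The sketch below assumes a daily frequency and compares the result with the manual approach. The names `manual_idx`, `range_idx`, and `same_dates` are illustrative.

```python
from datetime import datetime
import pandas as pd

# Index built from an explicit list of datetime objects
manual_idx = pd.DatetimeIndex([datetime(2021, 11, day) for day in range(1, 5)])

# Index generated by pandas: four consecutive days starting on 2021-11-01
range_idx = pd.date_range(start="2021-11-01", periods=4, freq="D")
print(range_idx)

# Both indexes hold the same four dates
same_dates = list(manual_idx) == list(range_idx)
print(same_dates)
# True
```

Unlike the manually built index, the one produced by pd.date_range records its frequency, which some resampling and shifting operations can use.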

Stock data usually consist of a date-time and a price. Here we generate the stock prices randomly with NumPy's np.random.randn function. It draws samples from the standard normal distribution, so the values can be negative and they differ on every run.

# Generate the associated data randomly with NumPy's random function.
s_data = np.random.randn(4,1)
print(s_data)
#[[ 1.16341894]
# [ 0.89951112]
# [-0.32026986]
# [ 0.46115012]]
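
Because np.random.randn is unseeded and centered on zero, the output above changes between runs and does not look like real prices. A minimal sketch of one alternative follows: a seeded generator produces small daily returns that are compounded from a starting price. The seed, the starting price of 100, and the 1% volatility are all assumed values chosen only for illustration.

```python
import numpy as np

# Seeded generator so that every run produces the same numbers (seed is arbitrary)
rng = np.random.default_rng(seed=42)

# Hypothetical daily returns: mean 0, standard deviation 1%
daily_returns = rng.normal(loc=0.0, scale=0.01, size=4)

# Compound the returns from an assumed starting price
start_price = 100.0
prices = start_price * np.cumprod(1.0 + daily_returns)

print(prices.shape)
# (4,)
print(bool((prices > 0).all()))
# True
```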

Next, we combine the random stock prices and the DatetimeIndex using the pandas DataFrame class. The output is a dataframe object.

s_hdr = ['stock_price']
s_df_obj = pd.DataFrame(s_data, index=datetime_pd_obj, columns=s_hdr)
print(s_df_obj)

#                 stock_price
#2021-11-01       1.163419
#2021-11-02       0.899511
#2021-11-03      -0.320270
#2021-11-04       0.461150
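
Once the dates form the index, rows can be selected by date. The sketch below uses fixed, made-up prices instead of random ones so that the results are predictable. The names `df`, `one_day`, and `window` are illustrative.

```python
import pandas as pd

# Small dataframe with a DatetimeIndex and made-up prices
idx = pd.date_range("2021-11-01", periods=4, freq="D")
df = pd.DataFrame({"stock_price": [101.0, 99.5, 102.3, 100.8]}, index=idx)

# Price on a single date
one_day = df.loc[pd.Timestamp("2021-11-02"), "stock_price"]
print(one_day)
# 99.5

# Rows in a date range; both end points are included
window = df.loc["2021-11-02":"2021-11-03"]
print(len(window))
# 2
```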

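We can query the dataframe object. For example, index.argmax() and index.argmin() return the integer positions of the latest and earliest dates in the index, while index.max() and index.min() return the dates themselves.

# Positions of the latest and earliest dates in the index
print(s_df_obj.index.argmax())
# 3
print(s_df_obj.index.argmin())
# 0

# The latest and earliest dates themselves
print(s_df_obj.index.max())
# 2021-11-04 00:00:00
print(s_df_obj.index.min())
# 2021-11-01 00:00:00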
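
The queries above look only at the dates in the index. To find the date on which the price itself was highest or lowest, one option is idxmax() and idxmin() on the price column. The sketch below uses fixed, made-up prices so that the answers are predictable. The names `df`, `best_day`, and `worst_day` are illustrative.

```python
import pandas as pd

# Small dataframe with a DatetimeIndex and made-up prices
idx = pd.date_range("2021-11-01", periods=4, freq="D")
df = pd.DataFrame({"stock_price": [101.0, 99.5, 102.3, 100.8]}, index=idx)

# Date of the highest price
best_day = df["stock_price"].idxmax()
print(best_day)
# 2021-11-03 00:00:00

# Date of the lowest price
worst_day = df["stock_price"].idxmin()
print(worst_day)
# 2021-11-02 00:00:00
```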

In conclusion, we have seen how to create time-series data using the pandas, NumPy, and datetime packages in Python. The entire code required to create time-series data manually is as follows.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from datetime import datetime


# Working with the Python datetime class.
# Using the datetime class, we can create a datetime object by supplying the appropriate arguments.

# Input data
c_year = 2021
c_month = 11
c_day = 10
c_hour = 1
c_min = 20
c_sec = 15

c_date_obj = datetime(c_year,c_month,c_day,c_hour,c_min,c_sec)
print(c_date_obj)
# 2021-11-10 01:20:15

# How to access individual parameters from the datetime object. 
print(c_date_obj.year)
# 2021

print(c_date_obj.month)
# 11

print(c_date_obj.day)
# 10

# Example to create datetime list
datetime_lst = [datetime(2021,11,1),datetime(2021,11,2),datetime(2021,11,3),datetime(2021,11,4)]
print(datetime_lst)
# [datetime.datetime(2021, 11, 1, 0, 0), datetime.datetime(2021, 11, 2, 0, 0), datetime.datetime(2021, 11, 3, 0, 0), datetime.datetime(2021, 11, 4, 0, 0)]

# Pandas datetime object
datetime_pd_obj = pd.DatetimeIndex(datetime_lst)
print(datetime_pd_obj)
# DatetimeIndex(['2021-11-01', '2021-11-02', '2021-11-03', '2021-11-04'], dtype='datetime64[ns]', freq=None)

# Generate the associated data randomly with NumPy's random function.
s_data = np.random.randn(4,1)
print(s_data)

#[[ 1.16341894]
# [ 0.89951112]
# [-0.32026986]
# [ 0.46115012]]

s_hdr = ['stock_price']
s_df_obj = pd.DataFrame(s_data, index=datetime_pd_obj, columns=s_hdr)
print(s_df_obj)

#                 stock_price
#2021-11-01       1.163419
#2021-11-02       0.899511
#2021-11-03      -0.320270
#2021-11-04       0.461150

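# Positions of the latest and earliest dates in the index
print(s_df_obj.index.argmax())
# 3
print(s_df_obj.index.argmin())
# 0

# The latest and earliest dates themselves
print(s_df_obj.index.max())
# 2021-11-04 00:00:00
print(s_df_obj.index.min())
# 2021-11-01 00:00:00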

CITE THIS AS:
“How to create time-series data in Python” From NotePub.io – Publish & Share Note! https://notepub.io/questions/how-to-create-time-series-data-in-python/
