Handling date and time data in pandas
Converting strings to datetime with pd.to_datetime
The argument passed to this function can be a single string, a list, or a Series.
pd.to_datetime('2018-10-26 12:00 -0500')
pd.to_datetime(['2018-10-26 12:00 -0500', '2018-10-26 13:00 -0500'])
df['WorkingDate'] = pd.to_datetime(df['WorkingDate'])
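The return type depends on the input type, which is worth checking once. The following is a minimal sketch; the sample date strings and the variable names (single, many, column) are made up for illustration and are not from the original notes.

```python
import pandas as pd

# A single string gives a single Timestamp.
single = pd.to_datetime('2018-10-26 12:00')

# A list of strings gives a DatetimeIndex.
many = pd.to_datetime(['2018-10-26 12:00', '2018-10-26 13:00'])

# A Series of strings gives a Series with a datetime dtype.
column = pd.to_datetime(pd.Series(['2018-10-26', '2018-10-27']))

print(type(single).__name__)   # Timestamp
print(type(many).__name__)     # DatetimeIndex
print(type(column).__name__)   # Series
```

This is why assigning the result back to a DataFrame column, as in the last line above, keeps the column a Series while changing its dtype.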
Generating a sequence of timestamps to a given specification with pd.date_range
date_series = pd.date_range(start='2024-5-14 8:20',end='2024-5-14 19:20',freq='10min')
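See the end of this article for the freq aliases.
Filtering by time
- With a DatetimeIndex, you can slice directly with loc.
- With a non-datetime index (anything other than a DatetimeIndex), you can call between on the datetime column to filter by time.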
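As a quick check of what date_range produces, here is a small sketch. It reuses the start, end and freq from the line above; the variable by_count and the use of the periods parameter are additions for illustration.

```python
import pandas as pd

# Same call as above: every 10 minutes from 08:20 to 19:20, both ends included.
date_series = pd.date_range(start='2024-5-14 8:20', end='2024-5-14 19:20', freq='10min')

print(type(date_series).__name__)  # DatetimeIndex
print(len(date_series))            # 67

# An equivalent way to describe the same range: a start plus a number of periods.
by_count = pd.date_range(start='2024-5-14 8:20', periods=67, freq='10min')
print(date_series.equals(by_count))  # True
```

The length of 67 (11 hours at 6 points per hour, plus the end point) is what the np.ones((67,2)) in the later examples relies on.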
import pandas as pd
import numpy as np
date_series = pd.date_range(start='2024-5-14 8:20',end='2024-5-14 19:20',freq='10min')
df = pd.DataFrame(np.ones((67,2)),
index=date_series, columns=['A', 'B'])
# The index is datetime-typed (a DatetimeIndex), so loc can slice by time strings
df_result = df.loc['2024-5-14 8:20':'2024-5-14 9:20']
print(df_result)
df2 = df.reset_index() # df2 has a non-datetime index; the timestamps move to the 'index' column
filter1 = df2['index'].between('2024-5-14 8:20','2024-5-14 9:20')
filter1_df = df2.loc[filter1,['A','B']]
print(filter1_df)
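Both approaches include the two endpoints, so they should select the same rows. The sketch below checks this on the same data; the names idx, by_loc, mask and by_between are introduced here for illustration.

```python
import numpy as np
import pandas as pd

idx = pd.date_range(start='2024-5-14 8:20', end='2024-5-14 19:20', freq='10min')
df = pd.DataFrame(np.ones((len(idx), 2)), index=idx, columns=['A', 'B'])

# DatetimeIndex: label slicing with loc includes both endpoints.
by_loc = df.loc['2024-5-14 8:20':'2024-5-14 9:20']

# Ordinary column: between is inclusive on both sides by default.
df2 = df.reset_index()
mask = df2['index'].between('2024-5-14 8:20', '2024-5-14 9:20')
by_between = df2.loc[mask]

print(len(by_loc), len(by_between))  # 7 7
```

The seven rows are 08:20, 08:30, 08:40, 08:50, 09:00, 09:10 and 09:20.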
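Downsampling data by time with resample
When the data is sampled too densely and the statistics need to be aggregated by hour, by day, by month and so on, you can use resample.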
import pandas as pd
import numpy as np
date_series = pd.date_range(start='2024-5-14 8:20',end='2024-5-14 19:20',freq='10min')
df = pd.DataFrame(np.ones((67,2)),
index=date_series, columns=['A', 'B'])
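# Downsample ('h' is used because pandas 2.2+ deprecates the uppercase 'H' alias)
result1 = df['A'].resample('h').sum() # hourly sum
result2 = df['A'].resample('h').count() # number of samples in each hour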
print(result1)
print(result2)
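To see how the hourly bins are formed, the following sketch counts the samples per bin for the same 10-minute range. It assumes the default resample settings for an hourly frequency (bins labeled by their left edge); the names idx, s and counts are illustrative.

```python
import numpy as np
import pandas as pd

idx = pd.date_range(start='2024-5-14 8:20', end='2024-5-14 19:20', freq='10min')
s = pd.Series(np.ones(len(idx)), index=idx)

# One bin per clock hour, labeled by the start of the hour.
counts = s.resample('h').count()

print(counts.index[0])   # 2024-05-14 08:00:00
print(len(counts))       # 12
print(counts.iloc[0], counts.iloc[1], counts.iloc[-1])  # 4 6 3
```

The first bin holds only 4 samples (08:20 to 08:50) and the last only 3 (19:00 to 19:20), because the data starts and ends partway through an hour; every full hour in between holds 6.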
Converting dates to strings
df['time'].apply(lambda x:x.strftime('%Y-%m-%d'))
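For a datetime column, the .dt accessor offers the same formatting without a Python-level lambda. A minimal sketch follows, assuming a column named time as in the line above; the sample dates and the names via_apply and via_dt are made up.

```python
import pandas as pd

df = pd.DataFrame({'time': pd.to_datetime(['2024-05-14 08:20', '2024-05-15 19:20'])})

# Element-wise formatting, as in the line above.
via_apply = df['time'].apply(lambda x: x.strftime('%Y-%m-%d'))

# Same result through the .dt accessor.
via_dt = df['time'].dt.strftime('%Y-%m-%d')

print(via_dt.tolist())  # ['2024-05-14', '2024-05-15']
```

Either form returns strings, so the result can no longer be used for time-based slicing or resampling.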
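Common freq aliases
https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#offset-aliases
The table below follows older pandas releases. Since pandas 2.2 several of these aliases are deprecated in favor of new spellings (for example ME for M, h for H, min for T), so check the link above for the current list.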
Alias       Description
B           business day frequency
C           custom business day frequency
D           calendar day frequency
W           weekly frequency
M           month end frequency
SM          semi-month end frequency (15th and end of month)
BM          business month end frequency
CBM         custom business month end frequency
MS          month start frequency
SMS         semi-month start frequency (1st and 15th)
BMS         business month start frequency
CBMS        custom business month start frequency
Q           quarter end frequency
BQ          business quarter end frequency
QS          quarter start frequency
BQS         business quarter start frequency
A, Y        year end frequency
BA, BY      business year end frequency
AS, YS      year start frequency
BAS, BYS    business year start frequency
BH          business hour frequency
H           hourly frequency
T, min      minutely frequency
S           secondly frequency
L, ms       milliseconds
U, us       microseconds
N           nanoseconds