How to get the index of a pandas rolling maximum
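import pandas as pd
df = pd.DataFrame({"time": ['8:01', '8:03', '8:04', '8:05', '8:06', '8:07', '8:08', '8:09', '8:10'],
                   "weight": [20, 19, 28, 27, 24, 51, 23, 33, 37]},  # first nine weights, as printed in the output later in this thread
                  )
df = df.set_index("time")
df['roll_max'] = df['weight'].rolling(window=3, min_periods=1).max()  # rolling maximum over 3 rows
df['roll_max_index'] = ''  # wanted: the index label that belongs to each rolling maximum
print(df)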
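'time' is the index,
'roll_max' is the largest weight in a moving window of 3 values.
I want the index that corresponds to that rolling maximum weight, but I don't know how to compute it.
Any help would be appreciated!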
https://attach.52pojie.cn//forum/202305/03/143013temkuqykeiuniny8.png?l
This post was last edited by hrh123 on 2023-5-4 20:39
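lizy169 posted on 2023-5-4 08:59
Thanks a lot! I tried it, and this approach gives wrong results if a later value repeats an earlier one. Could you help further ...
The code uses a dict to map weights to times, and dict keys must be unique. If the same weight appears at different times, the later entry overwrites the earlier one and the resulting index is wrong. You can try NumPy or the pandas rolling method to check for repeated values inside the window, for example: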
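import pandas as pd
time = ['8:01', '8:03', '8:04', '8:05', '8:06', '8:07', '8:08', '8:09', '8:10', '8:11']
weight = [20, 19, 28, 27, 24, 51, 23, 33, 37, 28]  # weights as printed in the output posted in this thread
df = pd.DataFrame({
    "time": time,
    "weight": weight},
)
df = df.set_index("time")
df['roll_max'] = df['weight'].rolling(window=3, min_periods=1).max().astype(int)
def last_is_duplicate(a):
    # True when the newest value in the window already occurs earlier in the same window
    if len(a) > 1:
        return a[-1] in a[:-1]
    else:
        return False
# The index holds strings, not timestamps, so a row-count window replaces '10s'.
# raw=True passes each window as a NumPy array. Only duplicates inside one window are flagged.
dup = df['weight'].rolling(window=3, min_periods=1).apply(last_is_duplicate, raw=True).astype('bool')
df = df[~dup]
print(df)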
Or:
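import pandas as pd
import numpy as np
# Each row of `window` is one full 3-value window; there are len(df) - 2 of them.
window = np.lib.stride_tricks.sliding_window_view(df['weight'].to_numpy(), 3)
dup = np.apply_along_axis(last_is_duplicate, 1, window).astype(bool)
# The first two rows have no full window, so mark them as non-duplicates to match len(df).
dup = np.concatenate([[False, False], dup])
df = df[~dup]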
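To make the overwrite problem concrete, here is a minimal sketch. The weight list is copied from the output posted in this thread, and `window_argmax_label` is a hypothetical helper of mine, not part of the original code. It looks the maximum up by position inside each window instead of by value, so repeated weights cannot collide.

```python
import pandas as pd

time = ['8:01', '8:03', '8:04', '8:05', '8:06', '8:07', '8:08', '8:09', '8:10', '8:11']
weight = [20, 19, 28, 27, 24, 51, 23, 33, 37, 28]  # copied from the thread's printed output

# A dict keyed by weight keeps only the LAST time seen for each weight.
datamap = dict(zip(weight, time))
overwritten = datamap[28]  # 28 first occurs at '8:04', but the dict now holds the later time

s = pd.Series(weight, index=time)

def window_argmax_label(i, size=3):
    """Hypothetical helper: label of the maximum among the `size` rows ending at row i."""
    start = max(i - size + 1, 0)           # shorter windows at the start, like min_periods=1
    return s.iloc[start:i + 1].idxmax()    # idxmax returns the label of the first maximum

labels = [window_argmax_label(i) for i in range(len(s))]
print(overwritten)
print(labels)
```

This loop is slow for long series, but it is easy to check by hand and can serve as a reference when trying faster variants.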
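import pandas as pd
time=['8:01', '8:03', '8:04', '8:05', '8:06', '8:07', '8:08', '8:09', '8:10']
weight=[20, 19, 28, 27, 24, 51, 23, 33, 37]  # first nine weights, as printed in the output later in this thread
datamap=dict(zip(weight,time))  # maps each weight to its time; only valid while all weights are unique
df = pd.DataFrame({
    "time": time,
    "weight": weight},
)
df = df.set_index("time")
df['roll_max'] = df['weight'].rolling(window=3, min_periods=1).max().astype(int)
df['roll_max_index']=df['roll_max'].apply(lambda x:datamap[x])  # look up the time of each rolling maximum
print(df)
This post was last edited by hrh123 on 2023-5-3 18:34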
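import numpy as np
pos = df['weight'].rolling(window=3, min_periods=1).apply(np.argmax, raw=True).astype(int)  # offset of the maximum inside each window
df['roll_max_index'] = df.index[(pos + np.maximum(np.arange(len(df)) - 2, 0)).to_numpy()]  # window start + offset = row position
Or, while the index is still numeric (before set_index): df['roll_max_index'] = df['weight'].rolling(3, min_periods=1).apply(lambda x: x.idxmax())
Or again, with NumPy: w = df['weight'].to_numpy(); maxidx = w[np.arange(w.size-3+1)[:,None] + np.arange(3)].argmax(1)
df.loc[df.index[2:], 'roll_max_index'] = df.index[maxidx + np.arange(maxidx.size)].to_numpy()  # the first two rows have no full window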
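The NumPy window idea can be wrapped into one function, sketched below. `rolling_idxmax` is a hypothetical name of mine, and the sketch assumes the column has no NaN values and that NumPy 1.20 or newer is installed (for `sliding_window_view`). Padding the front with `-inf` gives every row a window, which mimics `min_periods=1`; ties resolve to the earliest row.

```python
import numpy as np
import pandas as pd
from numpy.lib.stride_tricks import sliding_window_view

def rolling_idxmax(series, window):
    """Hypothetical helper: index label of the maximum in each trailing window."""
    values = series.to_numpy(dtype=float)
    # Pad the front with -inf so the first rows also get a (partial) window.
    padded = np.concatenate([np.full(window - 1, -np.inf), values])
    windows = sliding_window_view(padded, window)   # shape (len(series), window)
    offset = windows.argmax(axis=1)                 # position of the max inside each window
    # Window i starts at original row i - (window - 1); add the offset to get the row position.
    positions = np.arange(len(values)) - (window - 1) + offset
    return pd.Series(series.index.to_numpy()[positions], index=series.index)

time = ['8:01', '8:03', '8:04', '8:05', '8:06', '8:07', '8:08', '8:09', '8:10', '8:11']
weight = [20, 19, 28, 27, 24, 51, 23, 33, 37, 28]  # copied from the thread's printed output
df = pd.DataFrame({"time": time, "weight": weight}).set_index("time")
df['roll_max_index'] = rolling_idxmax(df['weight'], 3)
print(df)
```

Because the lookup goes through row positions, the repeated 28 at 8:11 does not disturb the rows around 8:04.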
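This post was last edited by lizy169 on 2023-5-4 09:01
Thanks a lot! I tried it, and this approach gives wrong results if a later value repeats an earlier one. Could you give me some more pointers? Here is what I ran: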
import pandas as pd
time=['8:01', '8:03', '8:04', '8:05', '8:06', '8:07', '8:08', '8:09', '8:10', '8:11']
weight=[20, 19, 28, 27, 24, 51, 23, 33, 37, 28]  # values as shown in the output below
datamap=dict(zip(weight,time))  # the second 28 (8:11) overwrites the first one (8:04)
df = pd.DataFrame({
    "time": time,
    "weight": weight},
)
df = df.set_index("time")
df['roll_max'] = df['weight'].rolling(window=3, min_periods=1).max().astype(int)
df['roll_max_index']=df['roll_max'].apply(lambda x:datamap[x])
print(df)
      weight  roll_max roll_max_index
time
8:01      20        20           8:01
8:03      19        20           8:01
8:04      28        28           8:11
8:05      27        28           8:11
8:06      24        28           8:11
8:07      51        51           8:07
8:08      23        51           8:07
8:09      33        51           8:07
8:10      37        37           8:10
8:11      28        37           8:10
Later I got the index this way, without using the time column as the index:
import pandas as pd
time=['8:01', '8:03', '8:04', '8:05', '8:06', '8:07', '8:08', '8:09', '8:10', '8:11']
weight=[20, 19, 28, 27, 24, 51, 23, 33, 37, 28]  # same values as in the output above
df = pd.DataFrame({
    "time": time,
    "weight": weight},
)
df['roll_max'] = df['weight'].rolling(window=3, min_periods=1).max().astype(int)
# With the default integer index, idxmax returns a number, which rolling.apply can handle.
df['move_max_idx'] = df['weight'].rolling(window=3, min_periods=1).apply(lambda x: x.idxmax()).astype(int)
print(df)
Just here to learn.
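If the time labels are still wanted, the integer positions from that last approach can be mapped back afterwards. The following is only a sketch under two assumptions: the DataFrame still has its default 0..n-1 index when the rolling step runs (so each label returned by `idxmax` equals the row position), and the weights are the ones printed earlier in the thread. The column name `roll_max_index` follows the original question.

```python
import pandas as pd

time = ['8:01', '8:03', '8:04', '8:05', '8:06', '8:07', '8:08', '8:09', '8:10', '8:11']
weight = [20, 19, 28, 27, 24, 51, 23, 33, 37, 28]  # copied from the thread's printed output
df = pd.DataFrame({"time": time, "weight": weight})

df['roll_max'] = df['weight'].rolling(window=3, min_periods=1).max().astype(int)

# With the default integer index, idxmax yields a number, so rolling.apply accepts it.
move_max_idx = (
    df['weight']
    .rolling(window=3, min_periods=1)
    .apply(lambda x: x.idxmax())
    .astype(int)
)

# Translate row positions into time labels, then make time the index as in the question.
df['roll_max_index'] = df['time'].to_numpy()[move_max_idx.to_numpy()]
df = df.set_index('time')
print(df)
```

Doing the rolling step before `set_index` is the key design choice here: `rolling.apply` needs a numeric return value, and a string label such as '8:04' would raise a TypeError.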