This post was last edited by lvcaolhx on 2021-3-31 16:00
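Use case: count the number of exam participants in each class, following a specified class order.
Table 1 (df1 in the code) holds the exam schedule; table 2 (df2 in the code) holds all classes in the specified order.
Goal: automatically compute the number of exam participants for every class listed in table 2, in that order.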
import pandas as pd
df1=pd.DataFrame({'id':[429,429,432,431,429,432,431,430,431,432],'name':['Andy1','Jacky1','Bruce1','rose','Andy1','Jacky1','Bruce1','rose','Andy1','Jacky1'],'target':[1,0,1,0,1,0,1,0,1,1]})
df2=pd.DataFrame({'id':[429,430,432],'name':['Andy2','Jacky2','rose2']})
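B_song = df1.groupby(['id','target'])['target'].agg('count').rename().reset_index() # Group by id and target and count the rows, then turn the MultiIndex back into ordinary columns (the unnamed count column ends up named 0)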
print('\n',B_song)
#print('\n',B_song.columns)
B_song.rename(columns={0:'counts'},inplace=True) # Rename the column that holds the counts from 0 to 'counts'
B_song=B_song[B_song['target']==1] # Keep only the rows where target is 1
print('\n',B_song)
s = B_song.set_index('id')['counts'] # Take the two columns we need (id, counts) as a Series indexed by id
#df1=pd.DataFrame(s).reset_index()
#df1=df1.set_index('id')
print('\n',s)
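df2=df2.set_index('id') # Make id the row index of df2 so that it matches the index of s, in preparation for the update below
df2['counts']= None # Add a new column named counts to df2
df2.update(s) # Write the values of s into the counts column (matched by the Series name) at the matching ids; calling update on df2 itself avoids the chained df2['counts'].update(s), which may act on a copy under pandas Copy-on-Write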
print('\n',df2)
Output:
PS D:\pcharm> & C:/Users/jiaoshi01/AppData/Local/Programs/Python/Python36-32/python.exe d:/pcharm/test002.py
    id  target  0
0  429       0  1
1  429       1  2
2  430       0  1
3  431       0  1
4  431       1  2
5  432       0  1
6  432       1  2

    id  target  counts
1  429       1       2
4  431       1       2
6  432       1       2

id
429    2
431    2
432    2
Name: counts, dtype: int64

       name  counts
id
429   Andy2       2
430  Jacky2    None
432   rose2       2
PS D:\pcharm> |
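For comparison, the same count can probably be written more compactly. The following is only a sketch, not the method used above: it filters the rows with target == 1 first, counts them per id with groupby().size(), and then uses reindex to put the counts in the order of df2. The helper name count_targets is made up for this illustration, and filling classes that have no matching rows with 0 (instead of None as in the output above) is an assumption about what is wanted.

```python
import pandas as pd

# Same sample data as in the post (the name column of df1 is not needed for counting)
df1 = pd.DataFrame({
    'id': [429, 429, 432, 431, 429, 432, 431, 430, 431, 432],
    'target': [1, 0, 1, 0, 1, 0, 1, 0, 1, 1],
})
df2 = pd.DataFrame({'id': [429, 430, 432], 'name': ['Andy2', 'Jacky2', 'rose2']})


def count_targets(schedule, classes):
    """Hypothetical helper: return a copy of `classes` with a `counts` column
    holding the number of rows in `schedule` with target == 1 for each id."""
    # Count only the rows where target is 1, one count per id
    per_id = schedule.loc[schedule['target'] == 1].groupby('id').size()
    result = classes.copy()
    # reindex follows the order of classes['id']; ids without matching rows get 0
    # (assumption: 0 is preferred over a missing value here)
    result['counts'] = per_id.reindex(result['id'], fill_value=0).to_numpy()
    return result


result = count_targets(df1, df2)
print(result)
```

Because the counts are looked up in the order of df2['id'], the order of the classes in table 2 is preserved and df2 itself is left unchanged.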