I have this table:
user_id | datetime | type
1 | 2015-01-01 | 1
1 | 2015-01-01 | 2
1 | 2015-01-01 | 2
1 | 2015-01-02 | 2
2 | 2015-01-01 | 2
2 | 2015-01-02 | 1
2 | 2015-01-02 | 2
I have this pivot_table code:
df = df.pivot_table('type', ['user_id'], ['datetime'], aggfunc=np.mean)
However, instead of np.mean, I want to apply unique() and sum() together to satisfy this condition: if there are both 1 and 2 on a specific day for a user, then I want to put 3; if there is only 1 for a specific day I want to put 2, etc.
For example, here is the desired output:
user_id | 2015-01-01 | 2015-01-02
1 | 3 | 2
2 | 2 | 3
Any ideas?
Please see the following approach:
Is this what you want?
In [50]: df.pivot_table('type', ['user_id'], ['datetime'], aggfunc=lambda x: x.unique().sum())
Out[50]:
datetime  2015-01-01  2015-01-02
user_id
1                  3           2
2                  2           3
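For anyone who wants to reproduce this, here is a minimal self-contained sketch that rebuilds the table from the question and applies the same aggregation. The helper name unique_sum and the variable names result and alt are only illustrative, and the dates are kept as plain strings for simplicity. The second variant (dropping duplicate rows first, then summing) is an alternative I am assuming is equivalent for this data, not something from the answer above.

```python
import pandas as pd

# Rebuild the sample table from the question (dates kept as strings).
df = pd.DataFrame({
    "user_id": [1, 1, 1, 1, 2, 2, 2],
    "datetime": ["2015-01-01", "2015-01-01", "2015-01-01", "2015-01-02",
                 "2015-01-01", "2015-01-02", "2015-01-02"],
    "type": [1, 2, 2, 2, 2, 1, 2],
})

def unique_sum(values):
    """Sum the distinct values in one (user_id, datetime) group."""
    return values.unique().sum()

# Same call as the answer, with a named function instead of a lambda.
result = df.pivot_table(values="type", index="user_id",
                        columns="datetime", aggfunc=unique_sum)
print(result)

# Assumed-equivalent alternative: remove repeated (user_id, datetime, type)
# rows first, so a plain sum only sees each distinct type once.
alt = df.drop_duplicates().pivot_table(values="type", index="user_id",
                                       columns="datetime", aggfunc="sum")
print(alt)
```

The named function behaves the same as the lambda; it just makes the intent easier to read.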