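Small data sets (< 150 rows):
[''.join(i) for i in zip(df["Year"].map(str), df["quarter"])]
or, slightly slower but more compact (the .str accessor needs Year to hold strings):
df.Year.str.cat(df.quarter)
Larger data sets (> 150 rows):
df['Year'].astype(str) + df['quarter']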
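The three approaches above can be compared side by side on a tiny frame. This is a minimal sketch, assuming `Year` is stored as integers (as the sample data in this answer suggests), which is why the `str.cat` variant converts it with `astype(str)` first. The output column names `period_zip`, `period_cat` and `period_add` are made up for illustration.

```python
import pandas as pd

# Hypothetical two-row frame mirroring the columns used in this answer.
df = pd.DataFrame({"Year": [2014, 2015], "quarter": ["q1", "q2"]})

# 1. List comprehension over the zipped columns.
df["period_zip"] = [
    "".join(pair) for pair in zip(df["Year"].map(str), df["quarter"])
]

# 2. Series.str.cat; Year is converted first because .str needs string values.
df["period_cat"] = df["Year"].astype(str).str.cat(df["quarter"])

# 3. Vectorized concatenation with the + operator.
df["period_add"] = df["Year"].astype(str) + df["quarter"]

print(df)
```

All three columns should hold the same strings, so the choice between them comes down to speed and readability.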
UPDATE: timing graph, Pandas 0.23.4
Let's test it on a 200K-row DataFrame:
In [250]: df
Out[250]:
   Year quarter
0  2014      q1
1  2015      q2
In [251]: df = pd.concat([df] * 10**5)
In [252]: df.shape
Out[252]: (200000, 2)
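One detail of the `pd.concat([df] * 10**5)` step: the repeated frames keep their original labels, so the index of the result repeats 0, 1, 0, 1, and so on. A minimal sketch, assuming a fresh consecutive index is preferred; the helper `build_test_frame` and its `copies` parameter are hypothetical names, not part of the original answer.

```python
import pandas as pd


def build_test_frame(copies):
    """Repeat the two-row sample frame `copies` times with a fresh index."""
    base = pd.DataFrame({"Year": [2014, 2015], "quarter": ["q1", "q2"]})
    # ignore_index=True renumbers the rows 0..n-1 instead of repeating 0, 1.
    return pd.concat([base] * copies, ignore_index=True)


# A small number of copies keeps this quick; the answer itself used 10**5.
demo = build_test_frame(copies=1000)
print(demo.shape)
print(demo.index.is_unique)
```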
UPDATE: new timings using Pandas 0.19.0
Timing without CPU/GPU optimization (sorted from fastest to slowest):
In [107]: %timeit df['Year'].astype(str) + df['quarter']
10 loops, best of 3: 131 ms per loop
In [106]: %timeit df['Year'].map(str) + df['quarter']
10 loops, best of 3: 161 ms per loop
In [108]: %timeit df.Year.str.cat(df.quarter)
10 loops, best of 3: 189 ms per loop
In [109]: %timeit df.loc[:, ['Year','quarter']].astype(str).sum(axis=1)
1 loop, best of 3: 567 ms per loop
In [110]: %timeit df[['Year','quarter']].astype(str).sum(axis=1)
1 loop, best of 3: 584 ms per loop
In [111]: %timeit df[['Year','quarter']].apply(lambda x : '{}{}'.format(x[0],x[1]), axis=1)
1 loop, best of 3: 24.7 s per loop
Timing with CPU/GPU optimization:
In [113]: %timeit df['Year'].astype(str) + df['quarter']
10 loops, best of 3: 53.3 ms per loop
In [114]: %timeit df['Year'].map(str) + df['quarter']
10 loops, best of 3: 65.5 ms per loop
In [115]: %timeit df.Year.str.cat(df.quarter)
10 loops, best of 3: 79.9 ms per loop
In [116]: %timeit df.loc[:, ['Year','quarter']].astype(str).sum(axis=1)
1 loop, best of 3: 230 ms per loop
In [117]: %timeit df[['Year','quarter']].astype(str).sum(axis=1)
1 loop, best of 3: 230 ms per loop
In [118]: %timeit df[['Year','quarter']].apply(lambda x : '{}{}'.format(x[0],x[1]), axis=1)
1 loop, best of 3: 9.38 s per loop
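Outside IPython, a similar comparison can be sketched with the standard `timeit` module. This is only a rough reproduction under stated assumptions: the frame is shrunk to 2,000 rows so it runs quickly, `Year` is assumed to be an integer column, only four of the variants are included, and the labels in `candidates` are made up for illustration. Absolute numbers will differ by machine and pandas version.

```python
import timeit

import pandas as pd

base = pd.DataFrame({"Year": [2014, 2015], "quarter": ["q1", "q2"]})
# 2,000 rows instead of the 200K used in the answer, to keep the run short.
df = pd.concat([base] * 1000, ignore_index=True)

# Each candidate builds the combined "Year + quarter" string column.
candidates = {
    "astype(str) +": lambda: df["Year"].astype(str) + df["quarter"],
    "map(str) +": lambda: df["Year"].map(str) + df["quarter"],
    "str.cat": lambda: df["Year"].astype(str).str.cat(df["quarter"]),
    "apply(axis=1)": lambda: df[["Year", "quarter"]].apply(
        lambda row: "{}{}".format(row["Year"], row["quarter"]), axis=1
    ),
}

# Check that every variant produces the same values before timing them.
outputs = {name: fn().tolist() for name, fn in candidates.items()}
reference = outputs["astype(str) +"]
all_equal = all(values == reference for values in outputs.values())

# Best of 3 repeats, 3 calls per repeat, reported per call.
timings = {
    name: min(timeit.repeat(fn, number=3, repeat=3)) / 3
    for name, fn in candidates.items()
}

for name, seconds in sorted(timings.items(), key=lambda item: item[1]):
    print(f"{name:<16} {seconds * 1e3:8.3f} ms per call")
```

Taking the minimum over repeats mirrors the "best of 3" figure that `%timeit` reports in the session above.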
Contribution to this answer by @anton vbr
Original: https://blog.csdn.net/weixin_42370927/article/details/114360180
Author: 打盹儿的番茄
Title: python dataframe multiply two columns_Combining two text columns in a pandas/Python DataFrame