I think you need numpy.select with broadcasting:
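import numpy as np
import pandas as pd

# Compare every index label against every column label via broadcasting
m1 = df.index.values[:, None] > df.columns.values
m2 = df.index.values[:, None] == df.columns.values
df = pd.DataFrame(np.select([m1, m2], ['k', 'U'], 'Y'), columns=df.columns, index=df.index)
print(df)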
    2  4  8
10  k  k  k
4   k  U  Y
2   U  Y  Y
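For anyone who wants to run this end to end, here is a minimal self-contained sketch. The input frame is an assumption on my part (the original df comes from the question and is not shown here): I picked index [10, 4, 2] and columns [2, 4, 8] to match the output above, and the cell values are placeholders.

```python
import numpy as np
import pandas as pd

# Hypothetical input: only the index and column labels matter here,
# the cell values are placeholders and get overwritten.
df = pd.DataFrame(0, index=[10, 4, 2], columns=[2, 4, 8])

idx = df.index.to_numpy()[:, None]   # column vector, shape (3, 1)
cols = df.columns.to_numpy()         # row vector, shape (3,)

# Comparing (3, 1) against (3,) broadcasts to a (3, 3) boolean mask.
m1 = idx > cols
m2 = idx == cols

# np.select takes the first matching condition per cell, else the default 'Y'.
out = pd.DataFrame(np.select([m1, m2], ["k", "U"], "Y"),
                   columns=df.columns, index=df.index)
print(out)
```

The `[:, None]` turns the index into a column vector, so each index label is compared against every column label without an explicit loop.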
Performance:
np.random.seed(1000)
N = 1000
a = np.random.randint(100, size=N)
b = np.random.randint(100, size=N)
df = pd.DataFrame(np.random.choice(list('abcdefgh'), size=(N, N)), columns=a, index=b)
print(df)
def us(df):
    values = np.array(np.array([df.index]).transpose() - np.array([df.columns]), dtype='object')
    greater = values > 0
    less = values < 0
    same = values == 0
    values[greater] = 'k'
    values[less] = 'Y'
    values[same] = 'U'
    return pd.DataFrame(values, columns=df.columns, index=df.index)
def jez(df):
    m1 = df.index.values[:, None] > df.columns.values
    m2 = df.index.values[:, None] == df.columns.values
    return pd.DataFrame(np.select([m1, m2], ['k', 'U'], 'Y'), columns=df.columns, index=df.index)
In [236]: %timeit us(df)
107 ms ± 358 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
In [237]: %timeit jez(df)
64 ms ± 299 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
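Before reading too much into the timings, it is worth confirming that both functions return the same cells. Below is a rough sketch that does so and then times each function with the standard timeit module in place of IPython's %timeit. The function bodies are copied from above. The smaller N and the repeat count are my own choices, so the absolute numbers will differ from the ones reported here.

```python
import timeit

import numpy as np
import pandas as pd


def us(df):
    # Index minus columns, stored as objects so strings can be written in place.
    values = np.array(np.array([df.index]).transpose() - np.array([df.columns]), dtype="object")
    greater = values > 0
    less = values < 0
    same = values == 0
    values[greater] = "k"
    values[less] = "Y"
    values[same] = "U"
    return pd.DataFrame(values, columns=df.columns, index=df.index)


def jez(df):
    m1 = df.index.values[:, None] > df.columns.values
    m2 = df.index.values[:, None] == df.columns.values
    return pd.DataFrame(np.select([m1, m2], ["k", "U"], "Y"), columns=df.columns, index=df.index)


np.random.seed(1000)
N = 200  # assumed smaller size so the check runs quickly
a = np.random.randint(100, size=N)
b = np.random.randint(100, size=N)
df = pd.DataFrame(np.random.choice(list("abcdefgh"), size=(N, N)), columns=a, index=b)

# Compare cell by cell as plain Python lists, which sidesteps dtype differences.
same_result = us(df).values.tolist() == jez(df).values.tolist()
print("same result:", same_result)

# Average wall-clock time per call over a handful of runs.
t_us = timeit.timeit(lambda: us(df), number=5) / 5
t_jez = timeit.timeit(lambda: jez(df), number=5) / 5
print(f"us:  {t_us * 1e3:.2f} ms per call")
print(f"jez: {t_jez * 1e3:.2f} ms per call")
```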
Original: https://blog.csdn.net/weixin_42318225/article/details/114354728
Author: bellebiself
Title: pandas: indexing multiple columns by column name_python – Pandas DataFrame: changing values based on a comparison of column and index values