Vector representation in Python: generating indicator-vector columns in a pandas DataFrame

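import pandas as pd
from sklearn.preprocessing import MultiLabelBinarizer

# df is assumed to already hold a 'genres' column whose cells are lists of strings
mlb = MultiLabelBinarizer()

# one 0/1 indicator column per distinct genre, aligned to df's index
df1 = pd.DataFrame(mlb.fit_transform(df['genres']), columns=mlb.classes_, index=df.index)

df = df.join(df1)
print(df)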

                        genres  Action  Adventure  Comedy  Drama  Family  \
0                      [Drama]       0          0       0      1       0
1      [Music, Drama, Romance]       0          0       0      1       0
2  [Action, Adventure, Comedy]       1          1       1      0       0
3   [Thriller, Romance, Drama]       0          0       0      1       0
4          [Adventure, Family]       0          1       0      0       1

   Music  Romance  Thriller
0      0        0         0
1      1        1         0
2      0        0         0
3      0        1         1
4      0        0         0
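
The post never shows how df was built, so the following is a minimal self-contained sketch of the same approach. The sample rows are an assumption reconstructed from the printed output, and the names indicators and result are illustrative only.

```python
import pandas as pd
from sklearn.preprocessing import MultiLabelBinarizer

# Sample data reconstructed from the printed output (an assumption:
# the original post does not show how df was created).
df = pd.DataFrame({
    "genres": [
        ["Drama"],
        ["Music", "Drama", "Romance"],
        ["Action", "Adventure", "Comedy"],
        ["Thriller", "Romance", "Drama"],
        ["Adventure", "Family"],
    ]
})

mlb = MultiLabelBinarizer()

# fit_transform returns a 0/1 matrix with one column per distinct label;
# mlb.classes_ holds the labels in sorted order, matching the matrix columns.
indicators = pd.DataFrame(
    mlb.fit_transform(df["genres"]),
    columns=mlb.classes_,
    index=df.index,
)

# Joining on the shared index keeps the original list column next to its indicators.
result = df.join(indicators)
print(result)
```

Passing index=df.index matters when df does not use the default 0..n-1 index, since join aligns rows by index label rather than by position.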

If you only need columns for a given list of genres, add reindex:

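genres = ['Action', 'Adventure', 'Comedy', 'Drama']

# starting again from the original df (before the join above)
df1 = pd.DataFrame(mlb.fit_transform(df['genres']), columns=mlb.classes_, index=df.index)

# keep only the listed genres; a listed genre absent from the data becomes a column of 0
df = df.join(df1.reindex(columns=genres, fill_value=0))
print(df)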

                        genres  Action  Adventure  Comedy  Drama
0                      [Drama]       0          0       0      1
1      [Music, Drama, Romance]       0          0       0      1
2  [Action, Adventure, Comedy]       1          1       1      0
3   [Thriller, Romance, Drama]       0          0       0      1
4          [Adventure, Family]       0          1       0      0
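
To illustrate what fill_value=0 contributes, here is a small sketch in which the wanted list includes a genre that never occurs in the data. The sample rows, the "Horror" label, and the names wanted and selected are hypothetical and not taken from the original post.

```python
import pandas as pd
from sklearn.preprocessing import MultiLabelBinarizer

# Hypothetical sample data for illustration only.
df = pd.DataFrame({
    "genres": [["Drama"], ["Action", "Comedy"], ["Adventure", "Family"]]
})

# "Horror" is a hypothetical genre that does not appear in any row.
wanted = ["Action", "Drama", "Horror"]

mlb = MultiLabelBinarizer()
indicators = pd.DataFrame(
    mlb.fit_transform(df["genres"]),
    columns=mlb.classes_,
    index=df.index,
)

# reindex drops genres outside the list and creates the missing ones;
# fill_value=0 fills those new columns with 0 instead of NaN.
selected = indicators.reindex(columns=wanted, fill_value=0)

result = df.join(selected)
print(result)
```

Without fill_value, the column for a missing genre would be filled with NaN, so it would no longer be a clean 0/1 indicator.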

Original: https://blog.csdn.net/weixin_33054847/article/details/112922651
Author: 狐狸君raphael
Title: Vector representation in Python: generating indicator-vector columns in a pandas DataFrame

