Data Analysis from Scratch with a Kaggle Project: Titanic, Part 2 (2.1)
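import pandas as pd
import numpy as np

# Load the training data (train.csv must be in the working directory)
df = pd.read_csv("train.csv")

# Count missing values per column, then print the column summary
df.isna().sum()
df.info()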
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 891 entries, 0 to 890
Data columns (total 12 columns):
 #   Column       Non-Null Count  Dtype
---  ------       --------------  -----
 0   PassengerId  891 non-null    int64
 1   Survived     891 non-null    int64
 2   Pclass       891 non-null    int64
 3   Name         891 non-null    object
 4   Sex          891 non-null    object
 5   Age          891 non-null    float64
 6   SibSp        891 non-null    int64
 7   Parch        891 non-null    int64
 8   Ticket       891 non-null    object
 9   Fare         891 non-null    float64
 10  Cabin        362 non-null    object
 11  Embarked     889 non-null    object
dtypes: float64(2), int64(5), object(5)
memory usage: 83.7+ KB
df.isna().sum()
PassengerId      0
Survived         0
Pclass           0
Name             0
Sex              0
Age              0
SibSp            0
Parch            0
Ticket           0
Fare             0
Cabin          529
Embarked         2
dtype: int64
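
The counts above come from summing a boolean mask: `isna()` marks each missing cell as `True`, and `sum()` adds those up per column. The following is a minimal sketch of that idea on a toy frame. The frame `toy` and its values are made up for illustration and are not the Titanic data. Taking the mean of the same mask gives the share of missing values, which is often easier to compare across columns.

```python
import numpy as np
import pandas as pd

# Toy frame with hypothetical values; it is NOT the Titanic data.
toy = pd.DataFrame({
    "Cabin": ["C85", np.nan, np.nan, "E46"],
    "Embarked": ["S", "C", np.nan, "S"],
    "Fare": [71.28, 7.25, 8.05, 51.86],
})

# Number of missing entries per column (True counts as 1 when summed).
missing_count = toy.isna().sum()

# Share of missing entries per column (the mean of the boolean mask).
missing_share = toy.isna().mean()

print(missing_count)
print(missing_share)
```
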
df1 = df.dropna(subset=['Cabin', 'Embarked'])
df1.isna().sum()
df1.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 360 entries, 1 to 889
Data columns (total 12 columns):
 #   Column       Non-Null Count  Dtype
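
The index in this output runs from 1 to 889 instead of 0 to 359 because `dropna` keeps the original row labels of the rows that survive. Below is a minimal sketch of that behavior, assuming a toy frame with made-up values (not the Titanic data). The names `toy`, `kept`, and `renumbered` are hypothetical. It also shows that columns outside `subset` are ignored when deciding which rows to drop.

```python
import numpy as np
import pandas as pd

# Toy frame with hypothetical values; it is NOT the Titanic data.
toy = pd.DataFrame({
    "Cabin": [np.nan, "C85", np.nan, "E46", "G6"],
    "Embarked": ["S", "C", "Q", np.nan, "S"],
    "Age": [22.0, np.nan, 26.0, 35.0, 4.0],
})

# Drop a row if Cabin OR Embarked is missing (how="any" is the default).
# Age is not listed in subset, so its NaN does not cause a drop.
kept = toy.dropna(subset=["Cabin", "Embarked"])

# The surviving rows keep their original labels, so the index has gaps.
print(kept.index.tolist())

# reset_index(drop=True) renumbers the rows from 0 if a clean index is wanted.
renumbered = kept.reset_index(drop=True)
print(renumbered.index.tolist())
```

Whether to renumber is a judgment call: keeping the original labels makes it easy to trace rows back to the source file, while a fresh index is simpler for positional access.
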
Original: https://blog.csdn.net/weixin_45058606/article/details/122003899
Author: 一个游在的小鱼
Title: Data Analysis from Scratch with a Kaggle Project: Titanic (Part 5)