Data Analysis from Scratch with a Kaggle Project: Titanic (Part 5)

Data Analysis from Scratch with a Kaggle Project: Titanic, Chapter 2, Section 2.1


import pandas as pd
import numpy as np
df = pd.read_csv("train.csv")

df.isna().sum()
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 891 entries, 0 to 890
Data columns (total 12 columns):
 #   Column       Non-Null Count  Dtype
 0   PassengerId  891 non-null    int64
 1   Survived     891 non-null    int64
 2   Pclass       891 non-null    int64
 3   Name         891 non-null    object
 4   Sex          891 non-null    object
 5   Age          891 non-null    float64
 6   SibSp        891 non-null    int64
 7   Parch        891 non-null    int64
 8   Ticket       891 non-null    object
 9   Fare         891 non-null    float64
 10  Cabin        362 non-null    object
 11  Embarked     889 non-null    object
dtypes: float64(2), int64(5), object(5)
memory usage: 83.7+ KB
df.isna().sum()
PassengerId      0
Survived         0
Pclass           0
Name             0
Sex              0
Age              0
SibSp            0
Parch            0
Ticket           0
Fare             0
Cabin          529
Embarked         2
dtype: int64
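
The two outputs above describe the same thing from opposite sides: `df.info()` reports non-null counts, while `df.isna().sum()` reports missing counts, so for each column the two should add up to the number of rows (for example 362 + 529 = 891 for Cabin). A minimal sketch of that cross-check follows. The `toy` frame and its values are hypothetical and are not taken from train.csv.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data for illustration only (not rows from train.csv)
toy = pd.DataFrame({
    "Cabin": ["C85", np.nan, np.nan, "E46"],
    "Embarked": ["S", "C", np.nan, "S"],
    "Fare": [7.25, 71.28, 8.05, 51.86],
})

missing = toy.isna().sum()         # missing values per column
non_null = toy.count()             # non-null values per column, as reported by info()
missing_ratio = toy.isna().mean()  # share of missing values per column

print(missing)
print(missing_ratio)
```

Looking at the ratio as well as the raw count can help when deciding whether dropping rows is acceptable or whether a column is too sparse to keep.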

df1 = df.dropna(subset=['Cabin', 'Embarked'])
df1.isna().sum()
df1.info()

Int64Index: 360 entries, 1 to 889
Data columns (total 12 columns):
 #   Column       Non-Null Count  Dtype
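
The index line in this output (360 entries, labelled 1 to 889) reflects two behaviours of `dropna`: with `subset`, only the listed columns are checked, and the surviving rows keep their original index labels. The sketch below illustrates both on a small frame. The `toy` data is hypothetical, and the `reset_index` step is an optional extra that is not part of the original code.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data for illustration only (not rows from train.csv)
toy = pd.DataFrame({
    "Cabin": ["C85", np.nan, "B28", "E46"],
    "Embarked": ["S", "C", np.nan, "S"],
    "Age": [np.nan, 30.0, 40.0, 50.0],
})

# Drop a row if Cabin OR Embarked is missing (the default is how="any").
# Age is not in `subset`, so a missing Age does not cause a row to be dropped.
kept = toy.dropna(subset=["Cabin", "Embarked"])
print(kept.index.tolist())  # original labels are preserved, leaving gaps

# Optional: renumber the rows from 0 if contiguous labels are preferred
renumbered = kept.reset_index(drop=True)
print(renumbered.index.tolist())
```

Whether to reset the index is a judgment call: keeping the original labels makes it easy to trace rows back to the raw file, while a fresh index is more convenient for positional access.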

Original: https://blog.csdn.net/weixin_45058606/article/details/122003899
Author: 一个游在的小鱼
Title: Data Analysis from Scratch with a Kaggle Project: Titanic (Part 5)
