Math Problem Statement

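  1. Given a pandas.DataFrame object df containing a name column (name), a salary column (salary), a sales column (sales), and an address column (address), where some values in the sales column are missing, use linear interpolation to fill in the missing values. Complete the code according to the comments.

import pandas as pd
import numpy as np  # required by the np.array calls below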

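# First, drop the rows of df that contain null values
(1) ______________________________

# Import the linear-interpolation dependency
from scipy.interpolate import interp1d

# Convert the salary column to an array; it is the independent variable
(2) ______________________________
# sales is the dependent variable
y = np.array(df_dropna["sales"].tolist())

# Fit a linear interpolation to x and y
LinearInsValue = interp1d(x, y, kind='linear')

# For each record whose sales value is missing, look up its salary value
for i in range(len(df)):
    # If the sales value in row i is NA
    (3) ______________________________
        # Find the corresponding salary value
        (4) ______________________________
        sales_value = LinearInsValue([s_value])
        # Set the sales at that position to the inferred sales_value[0]
        (5) ______________________________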

Solution
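
The page gives no worked answer, so the following is only one possible way to fill in blanks (1) to (5), not an official solution. The sample DataFrame is hypothetical (the problem does not show the contents of df), and the sketch assumes df has the default 0..n-1 index so that the label i used with df.loc matches the row position.

```python
import numpy as np
import pandas as pd
from scipy.interpolate import interp1d

# Hypothetical sample data; the original problem does not give df's contents.
df = pd.DataFrame({
    "name": ["Ann", "Bob", "Cid", "Dee", "Eve"],
    "salary": [3000.0, 4000.0, 5000.0, 6000.0, 7000.0],
    "sales": [30.0, np.nan, 50.0, np.nan, 70.0],
    "address": ["A St", "B St", "C St", "D St", "E St"],
})

# (1) Drop the rows of df that contain null values
df_dropna = df.dropna()

# (2) salary is the independent variable
x = np.array(df_dropna["salary"].tolist())
# sales is the dependent variable
y = np.array(df_dropna["sales"].tolist())

# Fit a linear interpolation to x and y
LinearInsValue = interp1d(x, y, kind="linear")

for i in range(len(df)):
    # (3) If the sales value in row i is NA
    if pd.isna(df.loc[i, "sales"]):
        # (4) Find the corresponding salary value
        s_value = df.loc[i, "salary"]
        sales_value = LinearInsValue([s_value])
        # (5) Set the sales at that position to the inferred value
        df.loc[i, "sales"] = sales_value[0]

print(df)
```

With its default settings, interp1d raises a ValueError for a salary outside the range of the known salaries, so rows like that would need separate handling (for example, the fill_value argument). Rows with a missing salary are also not handled here.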

Math Problem Analysis

Mathematical Concepts

Linear Interpolation
Data Handling with Pandas

Formulas

Linear interpolation formula
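
The page names the formula without stating it. For reference, the standard form is given below, assuming that x stands for salary and y for sales, with two neighboring known points (x_0, y_0) and (x_1, y_1) such that x_0 <= x <= x_1:

```latex
y = y_0 + (x - x_0)\,\frac{y_1 - y_0}{x_1 - x_0}, \qquad x_0 \le x \le x_1
```

For example, with known points (3000, 30) and (5000, 50), a salary of 4000 gives y = 30 + 1000 * 20 / 2000 = 40.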

Theorems

-

Suitable Grade Level

Advanced