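I am trying to use multiple linear regression to predict an output from a set of inputs. I have trained the model, and the code below runs and gives the values I expect: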
import pandas as pd
dataset = pd.read_excel('TEST.xlsx')
X = dataset.iloc[:, :-1].values
y = dataset.iloc[:, 5].values
# Encoding categorical data
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
labelencoder = LabelEncoder()
X[:, 0] = labelencoder.fit_transform(X[:, 0]) # 1ST COLUMN
labelencoder1 = LabelEncoder()
X[:, 1] = labelencoder1.fit_transform(X[:, 1]) # 2ND COLUMN
labelencoder2 = LabelEncoder()
X[:, 2] = labelencoder2.fit_transform(X[:, 2]) # 3RD COLUMN
onehotencoder = OneHotEncoder(categorical_features = "all")
X = onehotencoder.fit_transform(X).toarray()
# Avoiding the Dummy Variable Trap
X = X[:, 1:]
# Splitting the dataset into the Training set and Test set
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 0)
from sklearn.linear_model import LinearRegression
regressor = LinearRegression()
regressor.fit(X_train, y_train)
y_pred = regressor.predict(X_test) # UP TO HERE IT WORKS AS EXPECTED
Now I am trying to use the same model to evaluate another set of input data, as follows:
dataset1 = pd.read_excel('TEST1.xlsx') # NEW SET OF INPUT RECORDS TO BE EVALUATED
X1 = dataset1.iloc[:, :-1].values
# Encoding categorical data
labelencoder3 = LabelEncoder()
X1[:, 0] = labelencoder3.fit_transform(X1[:, 0])
labelencoder4 = LabelEncoder()
X1[:, 1] = labelencoder4.fit_transform(X1[:, 1])
labelencoder5 = LabelEncoder()
X1[:, 2] = labelencoder5.fit_transform(X1[:, 2])
onehotencoder2 = OneHotEncoder(categorical_features = "all")
X1 = onehotencoder2.fit_transform(X1).toarray()
X1 = X1[:, 1:]
output = regressor.predict(X1)
However, when I run this code I get the following error:
ValueError: shapes (6,13) and (390,) not aligned: 13 (dim 1) != 390 (dim 0)
It would be great if someone could help me solve this.
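I can't tell which line raised the error you posted, and the code snippet is not complete either. One general suggestion: install PyCharm, paste the code in, get it working on a small amount of data first, and only then run it on the full dataset. In general, a "shapes not aligned" error means either that X and y have inconsistent shapes, or that the data being passed in does not have the shape the trained model expects, so that is the direction to look in.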
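To make the second case concrete: the error message says the new input has 13 columns after encoding while the fitted model holds 390 coefficients. A likely cause is that the new data is encoded with freshly fitted encoders (fit_transform on X1), so the number of one-hot columns depends on which categories happen to appear in TEST1.xlsx. Below is a minimal sketch of the usual remedy, which is to fit the encoder once on the training data and only call transform on new data. The small DataFrames and the column names c0, c1, c2 and target are made up for illustration and stand in for the two Excel files. It also assumes a scikit-learn version in which OneHotEncoder accepts string columns directly and supports handle_unknown="ignore" (the categorical_features argument used in the question was removed in scikit-learn 0.22).

```python
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import OneHotEncoder

# Hypothetical stand-in for TEST.xlsx: three categorical inputs and a numeric target.
train = pd.DataFrame({
    "c0": ["a", "b", "c", "a", "b", "c"],
    "c1": ["x", "y", "x", "y", "x", "y"],
    "c2": ["p", "p", "q", "q", "r", "r"],
    "target": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
})

# Hypothetical stand-in for TEST1.xlsx: it contains fewer distinct categories.
new = pd.DataFrame({
    "c0": ["a", "b"],
    "c1": ["x", "x"],
    "c2": ["p", "q"],
})

feature_cols = ["c0", "c1", "c2"]

# Fit the encoder once, on the training features only.
# handle_unknown="ignore" encodes unseen categories as all zeros instead of raising.
encoder = OneHotEncoder(handle_unknown="ignore")
X_train = encoder.fit_transform(train[feature_cols]).toarray()
y_train = train["target"].to_numpy()

regressor = LinearRegression()
regressor.fit(X_train, y_train)

# Reuse the fitted encoder on new data: transform, not fit_transform.
X_new = encoder.transform(new[feature_cols]).toarray()

# For comparison: a fresh encoder fitted on the new data yields a different width,
# which is the kind of mismatch that triggers the "not aligned" error.
X_refit = OneHotEncoder().fit_transform(new[feature_cols]).toarray()

print("train:", X_train.shape, "new:", X_new.shape, "refit:", X_refit.shape)

# Works because X_new has the same number of columns as X_train.
output = regressor.predict(X_new)
print(output.shape)
```

The dummy-variable column is not dropped in this sketch, to keep it short; LinearRegression still fits with the redundant columns. If you do drop columns, drop the same ones for the training data and the new data. Printing X.shape before fit and before predict is a quick way to confirm that the two widths agree.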