Data Science Asked by Biohacker on September 1, 2021
I am trying to compute conformal predictions for my model on my data, but the call icp.calibrate(X_cal, y_cal) raises the following error:
Exception: Data must be 1-dimensional
Unfortunately, I am not sure what this error implies for the code below. I am using a pandas DataFrame for this.
#Code Snippet
from sklearn.tree import DecisionTreeRegressor
from nonconformist.cp import IcpRegressor
from nonconformist.base import RegressorAdapter
from nonconformist.nc import RegressorNc, AbsErrorErrFunc, RegressorNormalizer, NcFactory
from sklearn.model_selection import train_test_split
import numpy as np
import pandas as pd
# -----------------------------------------------------------------------------
# Setup training, calibration and test data
# -----------------------------------------------------------------------------
df = pd.read_csv ("prepared_data.csv")
# Initial split into train/test data
train = df.loc[df['split']== 'train']
valid = df.loc[df['split']== 'valid']
# Proper Validation Set (Split the Validation set into features and target)
X_valid = valid.drop(['expression'], axis = 1)
y_valid = valid.drop(columns = ['new_host', 'split', 'sequence'])
# Create Training Set (Split the Training set into features and target)
X_train = valid.drop(['expression'], axis = 1)
y_train = valid.drop(columns = ['new_host', 'split', 'sequence'])
# Split Training set into further training set and calibration set
X_train, X_cal, y_train, y_cal = train_test_split(X_train, y_train, test_size =0.2)
# -----------------------------------------------------------------------------
# Train and calibrate underlying model
# -----------------------------------------------------------------------------
underlying_model = RegressorAdapter(DecisionTreeRegressor(min_samples_leaf=5))
print("Underlying model loaded")
model = RegressorAdapter(underlying_model)
nc = RegressorNc(model, AbsErrorErrFunc())
print("Nonconformity Function Applied")
icp = IcpRegressor(nc) # Create an inductive conformal Regressor
print("ICP Regressor Created")
#Dataset Review
print('{} instances, {} features, {} classes'.format(y_train.size,
                                                      X_train.shape[1],
                                                      np.unique(y_train).size))
icp.fit(X_train, y_train)
icp.calibrate(X_cal, y_cal)
#Example Dataframe
new_host  split  sequence   expression
FALSE     train  AQVPYGVS   0.039267878
FALSE     train  ASVPYGVSI  0.039267878
FALSE     train  STNLYGSGR  0.261456561
FALSE     valid  NLYGSGLVR  0.265188519
FALSE     valid  SLGPSNLYG  0.419680588
FALSE     valid  ATSLGTTNG  0.145710993
I've tried splitting the dataset in various ways, but I continue to have trouble with this. In this case I want to split the data into train and test sets according to each observation's split value. After that, I will split the train set into train and calibration sets in a second step, where X_train holds my features and y_train holds my target.
It seems to me that this question is better off on stackoverflow.
Nevertheless, X_cal gets generated from X_train, and X_train from valid. But this is an at least 2-dimensional dataframe with new_host and sequence. As the error says, you should only input data that is 1-dimensional.
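To check which object is actually 2-dimensional, it can help to print the shapes before calling fit and calibrate. The sketch below is only an illustration on a toy copy of the example rows from the question, not the asker's real data, and it does not call nonconformist at all. It assumes the usual scikit-learn convention of a 2-D feature table and a 1-D target, and the names y_2d and y_1d are made up for this example. It shows that building the target with drop() leaves a one-column DataFrame, while selecting the column by name gives a 1-D array.

```python
import numpy as np
import pandas as pd

# Toy frame mirroring the example rows in the question (illustrative only).
df = pd.DataFrame({
    "new_host": [False] * 6,
    "split": ["train", "train", "train", "valid", "valid", "valid"],
    "sequence": ["AQVPYGVS", "ASVPYGVSI", "STNLYGSGR",
                 "NLYGSGLVR", "SLGPSNLYG", "ATSLGTTNG"],
    "expression": [0.039267878, 0.039267878, 0.261456561,
                   0.265188519, 0.419680588, 0.145710993],
})

# Select the training rows by the value of the 'split' column.
train = df.loc[df["split"] == "train"]

# drop() returns a DataFrame, so this target is still 2-D: shape (n, 1).
y_2d = train.drop(columns=["new_host", "split", "sequence"])

# Selecting the column by name gives a Series; to_numpy() gives a 1-D array.
y_1d = train["expression"].to_numpy()

# Features stay 2-D: one row per sample, one column per feature.
X = train.drop(columns=["expression", "split"])

print(y_2d.shape, y_1d.shape, X.shape)
```

Note that X here still contains the string column sequence. A scikit-learn tree model needs numeric features, so that column would have to be encoded first, which this sketch does not attempt.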
Answered by N. Kiefer on September 1, 2021