How to Save Random Forest Model to File in Python
When you fit a random forest model in Python, it is useful to save the fitted model so that you can reuse it later to make predictions on new data.
Saving the random forest model (or any other machine learning model) to a file saves time later, especially when the model takes significant time or resources to train.
In Python, you can use the dump function from the pickle module or the joblib package to save a random forest model to a file.
Method 1: Using pickle
# import package
import pickle
# save Model
with open("rf_model.pkl", "wb") as f:
    pickle.dump(model, f)
This will save the model as a binary file. You can use any file extension to save the model.
Method 2: Using joblib
# import package
import joblib
# save Model
joblib.dump(model, "rf_model.joblib")
The following examples explain in detail how to save the fitted random forest model to a file in Python.
Example 1: Using pickle
Let’s first develop a random forest model.
We will use the iris dataset from sklearn to fit the random forest model.
Load and split the dataset:
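# import packages
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split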
# load dataset
data = load_iris()
# split into training (70%) and testing (30%)
X_train, X_test, y_train, y_test = train_test_split(data.data, data.target, test_size=0.3)
# Fit random forest model
model = RandomForestClassifier(n_estimators=100)
model.fit(X_train, y_train)
# prediction on test dataset
y_pred = model.predict(X_test)
Now, you can save this fitted random forest model (model) to a file using the dump function from pickle.
# import package
import pickle
# save Model
with open("rf_model.pkl", "wb") as f:
    pickle.dump(model, f)
We have saved the fitted random forest model to the rf_model.pkl file.
If you want to load this model for future use, you can use the load function from pickle.
Load the model:
# load Model
with open("rf_model.pkl", "rb") as f:
    model = pickle.load(f)
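As a quick sanity check, you can confirm that the loaded model behaves like the original by comparing their predictions on the test set. The following is a minimal sketch, assuming scikit-learn and NumPy are installed. The temporary directory, the random_state values, and the loaded_model name are illustrative choices and are not part of the steps above.

```python
import os
import pickle
import tempfile

import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Fit a small random forest on the iris dataset.
# random_state is an assumption added here to make the run reproducible.
data = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.3, random_state=42
)
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# Save and reload the model inside a temporary directory (illustrative only)
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "rf_model.pkl")
    with open(path, "wb") as f:
        pickle.dump(model, f)
    with open(path, "rb") as f:
        loaded_model = pickle.load(f)

# The reloaded model should reproduce the original predictions
same_predictions = np.array_equal(
    model.predict(X_test), loaded_model.predict(X_test)
)
print("Predictions match:", same_predictions)
```

If the predictions match, the file contains the complete fitted model and can be used in place of retraining.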
Example 2: Using joblib
Let’s first develop a random forest model.
We will use the iris dataset from sklearn to fit the random forest model.
Load and split the dataset:
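# import packages
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split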
# load dataset
data = load_iris()
# split into training (70%) and testing (30%)
X_train, X_test, y_train, y_test = train_test_split(data.data, data.target, test_size=0.3)
# Fit random forest model
model = RandomForestClassifier(n_estimators=100)
model.fit(X_train, y_train)
# prediction on test dataset
y_pred = model.predict(X_test)
Now, you can save this fitted random forest model (model) to a file using the dump function from joblib.
# import package
import joblib
# save Model
joblib.dump(model, "rf_model.joblib")
We have saved the fitted random forest model to the rf_model.joblib file.
If you want to load this model for future use, you can use the load function from joblib.
Load the model:
# load Model
model = joblib.load("rf_model.joblib")
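Random forest files can grow large as the number of trees increases. joblib.dump accepts an optional compress argument (an integer from 0 to 9) that trades some saving and loading time for a smaller file. The sketch below is one way to try it, assuming scikit-learn, joblib, and NumPy are installed. The compression level of 3, the file names, and the temporary directory are illustrative assumptions, not recommendations from the steps above.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

# Fit a random forest on the full iris dataset (random_state is an assumption
# added for reproducibility)
data = load_iris()
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(data.data, data.target)

with tempfile.TemporaryDirectory() as tmp_dir:
    plain_path = os.path.join(tmp_dir, "rf_model.joblib")
    compressed_path = os.path.join(tmp_dir, "rf_model_compressed.joblib")

    # Save once without compression and once with compress=3 (illustrative level)
    joblib.dump(model, plain_path)
    joblib.dump(model, compressed_path, compress=3)

    plain_size = os.path.getsize(plain_path)
    compressed_size = os.path.getsize(compressed_path)

    # Loading works the same way for compressed and uncompressed files
    loaded_model = joblib.load(compressed_path)

print("Uncompressed size (bytes):", plain_size)
print("Compressed size (bytes):", compressed_size)

# The model loaded from the compressed file should predict identically
predictions_match = np.array_equal(
    model.predict(data.data), loaded_model.predict(data.data)
)
print("Predictions match:", predictions_match)
```

The right compression level depends on how often the model is loaded and how much disk space matters, so it is worth measuring on your own model.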