Build a Small Project: Train and Run an ML Model inside a Docker Container
Task description -
👉 Pull the CentOS container image from DockerHub and create a new container.
👉 Install Python on top of the Docker container.
👉 In the container, copy or create the machine learning model that you built in a Jupyter notebook.
Requirements-
Any OS (here we are using Red Hat Enterprise Linux 8) with Docker installed on top of it.
A dataset that you want to work on.
Step by step instructions-
Step 1- Find the OS name and version on Linux.
Command- cat /etc/os-release
It will tell you which OS you are running; in my case it is CentOS.
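If you would rather read the same information from Python, here is a minimal sketch. It assumes the usual KEY=value layout of /etc/os-release; the function name parse_os_release is only illustrative.

```python
import os


def parse_os_release(text):
    """Parse the KEY=value lines of an os-release file into a dict."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not KEY=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Values may be wrapped in single or double quotes.
        info[key] = value.strip().strip('"').strip("'")
    return info


# Only read the file if it exists (it will not on non-Linux systems).
if os.path.exists("/etc/os-release"):
    with open("/etc/os-release") as f:
        release = parse_os_release(f.read())
    print(release.get("NAME"), release.get("VERSION_ID"))
```

The NAME and VERSION_ID keys hold the same name and version that the cat command shows.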
Step 2- Check whether Docker is installed on top of the OS.
Command- docker info
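The same check can be scripted. The sketch below only looks for the docker executable on the PATH using the standard library; it does not tell you whether the Docker service is running, which is what docker info and the next step cover. The function name docker_cli_path is an assumption made for this example.

```python
import shutil


def docker_cli_path():
    """Return the full path of the docker executable, or None if it is not on PATH."""
    return shutil.which("docker")


path = docker_cli_path()
if path:
    print("docker found at", path)
else:
    print("docker is not on PATH; install it before continuing")
```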
Step 3- Start the Docker service with the following command:
Command- systemctl start docker
After this, run another command to check the status of the Docker service.
Command- systemctl status docker
Now the Docker service is started.
Step 4- Next, pull the CentOS image from DockerHub to our Red Hat Enterprise Linux 8 system with the following command.
Command- docker pull <image_name>:<version_name>
In our case the image is centos and the version is latest, so the command is docker pull centos:latest
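The name:version form has to be written without spaces, and Docker falls back to the latest tag when the version is left out. Here is a rough sketch of that rule in Python. It is a simplification for illustration only (it does not handle digest references such as name@sha256:...), and split_image_reference is a made-up helper, not part of any Docker library.

```python
def split_image_reference(ref):
    """Split 'name:tag' into (name, tag); the tag defaults to 'latest'."""
    if any(ch.isspace() for ch in ref):
        raise ValueError("image reference must not contain whitespace")
    # Only a colon after the last '/' separates the tag;
    # an earlier colon may belong to a registry port.
    last_segment_start = ref.rfind("/") + 1
    colon = ref.find(":", last_segment_start)
    if colon == -1:
        return ref, "latest"
    return ref[:colon], ref[colon + 1:]


print(split_image_reference("centos:latest"))
print(split_image_reference("centos"))
```

Both calls give the same name and tag, which is why docker pull centos and docker pull centos:latest fetch the same image.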
Step 5- To check whether the Docker image has been downloaded, type:
Command- docker images
We can see that the image is downloaded.
Step 6- Now create a container from this image.
Command- docker run -it centos:latest
A new container is now running from the centos image, latest version.
Step 7- Now install Python 3 (together with pip3) on top of the container we just created.
Command- yum install python3-pip
After Python 3 is installed successfully, we need to install the libraries required to work on our dataset.
Step 7 (continued)- Install the pandas library so that we can read or load the dataset, which is stored in a file with the .csv extension.
Command- pip3 install pandas
The NumPy library is installed along with it.
Step 8- Now install the scikit-learn library.
Command- pip3 install scikit-learn
This library provides many useful functions and algorithms for creating ML models.
After installing the required libraries, the base environment is ready.
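Before moving on, it can help to confirm that the libraries are really importable inside the container. A small sketch using only the standard library follows; the helper name missing_modules is an assumption. Note that scikit-learn is imported as sklearn, and that joblib (used later to save the model) comes in as a scikit-learn dependency.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


required = ["pandas", "sklearn", "joblib"]
missing = missing_modules(required)
if missing:
    print("Still missing:", ", ".join(missing))
else:
    print("All required libraries are installed")
```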
Now it's time to work on the dataset.
Step 9- Install git so that we can download the dataset from GitHub.
Command- yum install git
Prerequisite-
First upload your dataset to GitHub.
Step 10- Now download the dataset inside the container using the git clone command.
GitHub repo link to download the dataset:-
Command to download the dataset- git clone <url of your dataset>
Now the dataset is downloaded inside the container. git clone places the files in a new directory named after the repository, so move into that directory with cd before continuing. To check the downloaded files:
Command- ls
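Once the file is there, a quick look at its header row confirms which columns are available before training. This is a sketch using the standard csv module; csv_columns is an illustrative helper, and the file name SalaryData.csv is the one used by the training script later in this post.

```python
import csv
import os


def csv_columns(file_obj):
    """Return the header row (column names) of an open CSV file."""
    reader = csv.reader(file_obj)
    return next(reader, [])


# Only inspect the file if it is present in the current directory.
if os.path.exists("SalaryData.csv"):
    with open("SalaryData.csv", newline="") as f:
        print("Columns:", csv_columns(f))
```

The training script expects a column named Salary, so that name should appear in the printed list.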
Step 11- Now create a Python script file named salaryml.py using the vi editor, so that we can train our model and run it.
Command- vi salaryml.py
The salaryml.py file will open. Press "i" on the keyboard to enter insert mode and write the code in this file. When you are done, press Esc and type :wq to save and exit.
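Code-
import pandas as pd
import joblib
from sklearn.linear_model import LinearRegression

# Load the dataset (the CSV file must be in the current directory)
dataset = pd.read_csv("SalaryData.csv")
print("Dataset has been loaded..")

# Target: the Salary column. Features: every other column
# (assumed here to be a single years-of-experience column).
y = dataset["Salary"]
x = dataset.drop(columns=["Salary"]).values

# Train a linear regression model
model = LinearRegression()
model.fit(x, y)
print("Model has been created")

# Save the trained model to disk
joblib.dump(model, "Salarypredicter-model.pk1")
print("Now our model has been saved")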
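To get a feel for what the training step does, here is a plain-Python sketch of ordinary least squares for a single feature, which is the kind of straight-line fit LinearRegression performs when there is only one input column. The function names and the toy numbers are made up for illustration; they are not taken from the salary dataset, and this is not scikit-learn's own implementation.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = intercept + slope * x for one feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_line(slope, intercept, x):
    """Evaluate the fitted line at x."""
    return intercept + slope * x


# Toy data (hypothetical): points that lie exactly on y = 2x + 1.
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(slope, intercept)
print(predict_line(slope, intercept, 5))
```

With real salary data the points will not sit exactly on a line, and the fit picks the slope and intercept that minimise the squared error.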
Step 12- Now run this file to check whether it works.
Command- python3 salaryml.py
If the code runs successfully, create one more Python script to predict the salary for a given number of years of experience.
Command- vi salarypredict.py
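Code-
import joblib

# Load the model saved by salaryml.py
pred_model = joblib.load("Salarypredicter-model.pk1")

p = int(input("Enter years of experience- "))
# predict expects a 2D array: one row per sample, one column per feature
sal = pred_model.predict([[p]])
print("Estimated Salary is-", round(sal[0], 2), "INR.")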
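The script above will stop with an error if the user types something that is not a whole number. A possible way to make the input handling a little safer is sketched below. It is a variation, not the original script: it accepts decimals such as 2.5 by using float instead of int, and the helper names parse_years and as_feature_rows are hypothetical.

```python
import math


def parse_years(text):
    """Convert user input to a non-negative number of years, or raise ValueError."""
    years = float(text.strip())  # raises ValueError for non-numeric input
    if not math.isfinite(years) or years < 0:
        raise ValueError("years of experience must be a non-negative number")
    return years


def as_feature_rows(years):
    """Wrap one value as a 2D list: one row (sample) with one column (feature)."""
    return [[years]]


years = parse_years("5")
print(as_feature_rows(years))
```

The nested list is what gets passed to predict, in the same way as [[p]] in the script.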
Step 13- Lastly, run this file to check whether it works.
Command- python3 salarypredict.py
Bravo! The code predicts the salary successfully.
and yeah!
We are done!!!
Thank you!