Supervised vs Unsupervised Learning

2023-12-21

Machine learning (ML) is a subfield of artificial intelligence (AI) that focuses on the development of computer models that allow computers (machines) to learn from existing data and to perform tasks on new data without explicit programming.

Machine learning can be divided into supervised learning and unsupervised learning. The main difference between supervised learning and unsupervised learning is that in supervised learning the models are

Method 1

The following example demonstrates how to create a Pandas DataFrame with customized values for each column.

# load packages
import pandas as pd
import numpy as np

# set random seed for reproducibility
np.random.seed(42)

# create a random pandas DataFrame
df = pd.DataFrame({'col1': np.random.rand(3), 
                   'col2': np.random.randint(1, 10, 3),
                   'col3': np.random.randn(3)})   
df

       col1  col2      col3
0  0.374540     5 -0.094621
1  0.950714     7 -0.928828
2  0.731994     3 -0.885230

In the above example, we have created a Pandas DataFrame with three columns.

np.random.rand(3) creates three random values drawn uniformly from the interval [0, 1). np.random.randint(1, 10, 3) creates three random integers from 1 to 9 (the upper bound, 10, is excluded). np.random.randn(3) creates three random values from the standard normal distribution.
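
To see the range of each generator for yourself, one option is to draw a larger sample and inspect its extremes. The following is only a quick sketch; the sample size of 10,000 and the variable names are arbitrary choices made for illustration.

```python
# load packages
import numpy as np

# set random seed for reproducibility
np.random.seed(42)

n = 10_000  # assumed sample size, large enough to reach the extremes

uniform = np.random.rand(n)             # uniform floats in [0, 1)
integers = np.random.randint(1, 10, n)  # integers 1..9; 10 is excluded
normal = np.random.randn(n)             # standard normal floats

# the uniform values never reach 1
print(uniform.min() >= 0, uniform.max() < 1)

# smallest and largest integer drawn
print(integers.min(), integers.max())

# sample mean and standard deviation should be close to 0 and 1
print(normal.mean(), normal.std())
```

The integer check is the one most worth remembering: the second argument of np.random.randint is an exclusive upper bound, so 10 never appears.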

This example creates a Pandas DataFrame with 3 rows and 3 columns, but you can adjust the size and structure of the DataFrame to suit your requirements.
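
As a sketch of how the row count could be made adjustable, the example below wraps the same three column types in a small function. The function name make_random_df and its parameters are hypothetical. It also swaps in NumPy's newer np.random.default_rng generator in place of np.random.seed, so the values will not match the output shown above.

```python
# load packages
import numpy as np
import pandas as pd


def make_random_df(n_rows, seed=42):
    """Hypothetical helper: DataFrame with n_rows rows of mixed random columns."""
    # local generator, so the global NumPy random state is left untouched
    rng = np.random.default_rng(seed)
    return pd.DataFrame({
        'col1': rng.random(n_rows),           # uniform floats in [0, 1)
        'col2': rng.integers(1, 10, n_rows),  # integers 1..9; 10 is excluded
        'col3': rng.standard_normal(n_rows),  # standard normal floats
    })


df = make_random_df(6)
print(df.shape)
```

Because the generator is created inside the function, calling make_random_df twice with the same seed should return identical DataFrames.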

Method 2

The following example demonstrates how to create a Pandas DataFrame with the same type of values in all columns.

# load packages
import pandas as pd
import numpy as np

# set random seed for reproducibility
np.random.seed(42)

# create a random pandas DataFrame
df = pd.DataFrame(np.random.randint(0, 50, size=(5, 3)), 
                  columns=['col1', 'col2', 'col3'])
df   

   col1  col2  col3
0    38    28    14
1    42     7    20
2    38    18    22
3    10    10    23
4    35    39    23

In the above example, we have created a Pandas DataFrame of integer values with three columns. np.random.randint(0, 50, size=(5, 3)) creates a 5x3 array of random integers from 0 to 49 (the upper bound, 50, is excluded).

This example creates a Pandas DataFrame with 5 rows and 3 columns, but you can adjust the size and structure of the DataFrame to suit your requirements.
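
One way to change the shape is to pull the row and column counts into variables and generate the column names from them. This is a minimal sketch; the sizes 4 and 6 and the col1, col2, ... naming pattern are assumptions chosen to match the style of the example above.

```python
# load packages
import numpy as np
import pandas as pd

# set random seed for reproducibility
np.random.seed(42)

n_rows, n_cols = 4, 6  # assumed sizes; change as needed

# build the names col1 ... colN to match the number of columns
columns = [f'col{i + 1}' for i in range(n_cols)]

# create a random pandas DataFrame of integers from 0 to 49
df = pd.DataFrame(np.random.randint(0, 50, size=(n_rows, n_cols)),
                  columns=columns)

print(df.shape)
print(list(df.columns))
```

The length of the columns list must equal n_cols, otherwise pandas raises an error when constructing the DataFrame.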

Method 3

You can also add a categorical variable while creating a random Pandas DataFrame.

# load packages
import pandas as pd
import numpy as np

# set random seed for reproducibility
np.random.seed(42)

# create a random pandas DataFrame
df = pd.DataFrame({'col1': np.random.rand(3), 
                   'col2': np.random.randint(1, 10, 3),
                   'col3': np.random.choice(['a', 'b'], size=3),
                   })   
df

       col1  col2 col3
0  0.374540     5    a
1  0.950714     7    a
2  0.731994     3    a

In the above example, we have created a Pandas DataFrame with both numerical and categorical values.

np.random.rand(3) creates three random values drawn uniformly from the interval [0, 1). np.random.randint(1, 10, 3) creates three random integers from 1 to 9 (the upper bound, 10, is excluded). np.random.choice(['a', 'b'], size=3) creates three categorical values, each chosen at random from the given list.
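
Note that np.random.choice returns plain strings, so the resulting column is not stored with pandas' category dtype. If that matters for your use case, one possible approach is to wrap the values in pd.Categorical. The sketch below also uses the p argument of np.random.choice to make one label more likely than the other; the row count of 100 and the 80/20 weights are arbitrary values chosen for illustration.

```python
# load packages
import numpy as np
import pandas as pd

# set random seed for reproducibility
np.random.seed(42)

n_rows = 100         # assumed number of rows
labels = ['a', 'b']  # same labels as in the example above

# create a random pandas DataFrame with a true categorical column
df = pd.DataFrame({
    'col1': np.random.rand(n_rows),
    'col3': pd.Categorical(
        # p gives the probability of each label (assumed 80% 'a', 20% 'b')
        np.random.choice(labels, size=n_rows, p=[0.8, 0.2]),
        categories=labels,
    ),
})

print(df['col3'].dtype)
print(df['col3'].value_counts())
```

Passing categories explicitly keeps both labels registered on the column even if one of them happens not to be drawn in a small sample.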