
Binary vs one hot


Also, should the learning rate be the reciprocal of the number of epochs?


taby said:
Do you do networks in C++?

LOL, I thought you were asking an off-topic question about general networking, not neural networks. That's why I ig'ed you.

Anyway, I have just a passing knowledge of this machine learning stuff.

🙂🙂🙂🙂🙂 ← The tone posse, ready for action.

That’s ok. I know sockets already.

taby said:

Also, should the learning rate be the reciprocal of the number of epochs?

No, I don't think those two things are closely connected. I think it has more to do with what optimizer you use and the size of the network. You will need to experiment a bit to see what works in your situation, but I can tell you what I have learned over the years.

In order to set the initial learning rate, I usually try to start training with a variety of learning rates, only for a few seconds or minutes each. Very large learning rates will result in explosions where weights go to infinity, or you get NaNs. Find the approximate largest learning rate where things don't seem to explode immediately and then divide it by 3 or so for safety. I would then plan to start training with that initial learning rate, so you can make a lot of progress quickly.
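Here is a rough sketch of that kind of sweep in PyTorch; the tiny model and the random placeholder data are just for illustration, not part of your setup:

import math
import torch

# Placeholder data and model, only to illustrate the sweep itself.
x = torch.randn(256, 16)
y = torch.randn(256, 2)

def briefly_train(lr, steps=50):
	net = torch.nn.Sequential(torch.nn.Linear(16, 32), torch.nn.ReLU(), torch.nn.Linear(32, 2))
	optimizer = torch.optim.Adam(net.parameters(), lr=lr)
	loss_func = torch.nn.MSELoss()
	for _ in range(steps):
		optimizer.zero_grad()
		loss = loss_func(net(x), y)
		loss.backward()
		optimizer.step()
	return loss.item()

# Try the largest rates first and keep the first one that does not explode (no NaN/inf).
largest_stable = None
for lr in [1.0, 0.3, 0.1, 0.03, 0.01, 0.003, 0.001]:
	if math.isfinite(briefly_train(lr)):
		largest_stable = lr
		break

initial_lr = largest_stable / 3  # divide by ~3 for safety
print("initial learning rate:", initial_lr)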

In the kind of network that you are using, I believe people tend to use an adaptive optimizer, like Adam. Then a popular learning-rate schedule is to drop it two or three times suddenly during the training. For example, 0.005 for a while (if that's what came out of the previous procedure), then 0.001 for a while, then 0.0001 for a bit at the end, so the final weights have less noise.
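With PyTorch's built-in schedulers, that kind of drop schedule might look something like this (the milestones and the factor are made-up numbers for illustration):

import torch

net = torch.nn.Linear(16, 2)  # placeholder model
optimizer = torch.optim.Adam(net.parameters(), lr=0.005)

# Multiply the learning rate by 0.2 at epochs 60 and 90:
# roughly 0.005 for a while, then 0.001, then 0.0002 at the end.
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[60, 90], gamma=0.2)

for epoch in range(100):
	optimizer.zero_grad()
	loss = torch.nn.functional.mse_loss(net(torch.randn(8, 16)), torch.randn(8, 2))
	loss.backward()
	optimizer.step()   # one (placeholder) training step per epoch
	scheduler.step()   # advance the schedule once per epoch

print(scheduler.get_last_lr())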

There is another style of training where you make the network quite large and then train until it completely memorizes all the labels. Traditional statistics would say that overfitting will lead to bad performance in out-of-sample data. However, if you use a very simple SGD optimizer and keep the learning rate pretty high for the whole training, this can actually perform well in out-of-sample data. There are some theoretical papers explaining the reasons for this, and I can try to find one if you are interested. I don't have any personal experience with this type of training.
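Purely as a sketch of what that style looks like (I have not tuned this; the oversized model, the random labels, and the 0.1 rate are all placeholders):

import torch

x = torch.randn(512, 16)
y = torch.randint(0, 2, (512,))  # training labels to be memorized

# Deliberately oversized network, plain SGD, fixed fairly high learning rate.
net = torch.nn.Sequential(torch.nn.Linear(16, 1024), torch.nn.ReLU(), torch.nn.Linear(1024, 2))
optimizer = torch.optim.SGD(net.parameters(), lr=0.1)
loss_func = torch.nn.CrossEntropyLoss()

for step in range(100000):  # cap the loop in case it never fully memorizes
	optimizer.zero_grad()
	logits = net(x)
	loss = loss_func(logits, y)
	loss.backward()
	optimizer.step()
	if (logits.argmax(dim=1) == y).all():  # stop once every training label is correct
		break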

Thank you for your expertise.

I was thinking of training x networks, all with different random seeds but the same learning rate. Then the very best of those x is used in the next round. It’s worth a try.

So should it be a race to lower the loss amount?

Better yet, in the case of a tie with regard to loss, pick the one with the highest derivative of the loss.
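Something like this, maybe: when the final losses are (nearly) tied, keep the candidate whose loss is still falling fastest, using the last two epoch losses as a rough derivative. A hypothetical sketch (the candidate records are made up):

def pick_best(candidates, tie_tolerance=1e-6):
	# candidates: list of (net, loss_history) pairs, where loss_history holds the per-epoch losses
	def slope(history):
		return history[-1] - history[-2]  # negative means the loss is still falling

	best_net, best_history = candidates[0]

	for net, history in candidates[1:]:
		if history[-1] < best_history[-1] - tie_tolerance:
			best_net, best_history = net, history  # clearly lower final loss wins
		elif abs(history[-1] - best_history[-1]) <= tie_tolerance and slope(history) < slope(best_history):
			best_net, best_history = net, history  # tie: the steeper descent wins

	return best_net

# Same final loss, but "net_b" is still improving faster, so it wins the tie.
print(pick_best([("net_a", [0.30, 0.20]), ("net_b", [0.40, 0.20])]))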

Here is some basic code for picking networks by fitness, based on loss:

import numpy as np
import math
import cv2
import random
import torch
from torch import flatten
from torch.autograd import Variable
import torch.nn as nn
import torch.nn.functional as F

import os.path
from os import path

import time




img_width = 32
num_channels = 3

#num_input_components = img_width*img_width*num_channels
num_output_components = 2

num_epochs = 10
learning_rate = 0.001

max_train_files = 1000

num_recursions = 10
num_child_networks = 5




class Net(torch.nn.Module):

	def __init__(self, num_channels, num_output_components):

		super().__init__()
		self.model = torch.nn.Sequential(
		    #Input = 3 x 32 x 32, Output = 32 x 32 x 32
		    torch.nn.Conv2d(in_channels = num_channels, out_channels = 32, kernel_size = 3, padding = 1), 
		    torch.nn.ReLU(),
		    #Input = 32 x 32 x 32, Output = 32 x 16 x 16
		    torch.nn.MaxPool2d(kernel_size=2),
  
		    #Input = 32 x 16 x 16, Output = 64 x 16 x 16
		    torch.nn.Conv2d(in_channels = 32, out_channels = 64, kernel_size = 3, padding = 1),
		    torch.nn.ReLU(),
		    #Input = 64 x 16 x 16, Output = 64 x 8 x 8
		    torch.nn.MaxPool2d(kernel_size=2),
		      
		    #Input = 64 x 8 x 8, Output = 64 x 8 x 8
		    torch.nn.Conv2d(in_channels = 64, out_channels = 64, kernel_size = 3, padding = 1),
		    torch.nn.ReLU(),
		    #Input = 64 x 8 x 8, Output = 64 x 4 x 4
		    torch.nn.MaxPool2d(kernel_size=2),
  
		    torch.nn.Flatten(),
		    torch.nn.Linear(1024, 256),
		    torch.nn.ReLU(),
		    torch.nn.Linear(256, num_output_components)
		)
  
	def forward(self, x):
		return self.model(x)








class float_image:

	def __init__(self, img):
		self.img = img

class image_type:

	def __init__(self, img_type, float_img):
		self.img_type = img_type
		self.float_img = float_img





# Train (or continue training) a network for num_epochs passes over the full training set,
# returning the trained network along with its final loss.
def do_network(in_net, batch, ground_truth, num_channels, num_output_components, all_train_files, random_seed, num_epochs):

	if (in_net is None):
		in_net = Net(num_channels, num_output_components)

	net = in_net

	random.seed(random_seed)
	torch.manual_seed(random_seed)	# seed PyTorch as well, so the seed also covers weight initialization

	optimizer = torch.optim.Adam(net.parameters(), lr = learning_rate)
	loss_func = torch.nn.MSELoss()

	loss = 0

	for epoch in range(num_epochs):
		
		random.shuffle(all_train_files)

		count = 0

		for i in all_train_files:

			batch[count] = i.float_img
		
			if i.img_type == 0: # cat

				ground_truth[count][0] = 1
				ground_truth[count][1] = 0
			
			elif i.img_type == 1: # dog
				
				ground_truth[count][0] = 0
				ground_truth[count][1] = 1

			count = count + 1
	
		x = Variable(torch.from_numpy(batch))
		y = Variable(torch.from_numpy(ground_truth))

		prediction = net(x)	 
		loss = loss_func(prediction, y)

		print(epoch, loss)

		optimizer.zero_grad()	 # clear gradients for next train
		loss.backward()		 # backpropagation, compute gradients
		optimizer.step()		# apply gradients
	
	return net, loss






if False: #path.exists('weights_' + str(img_width) + '_' + str(num_epochs) + '.pth'):
	net.load_state_dict(torch.load('weights_' + str(img_width) + '_' + str(num_epochs) + '.pth'))
	print("loaded file successfully")
else:
	print("training...")





	all_train_files = []




	file_count = 0

	path = 'training_set/cats/'
	filenames = next(os.walk(path))[2]

	for f in filenames:

		file_count = file_count + 1
		if file_count >= max_train_files:
			break

		print(path + f)
		img = cv2.imread(path + f)
		
		if img is not None:

			img = img.astype(np.float32)
			res = cv2.resize(img, dsize=(img_width, img_width), interpolation=cv2.INTER_LINEAR)
			flat_file = res / 255.0
			flat_file = np.transpose(flat_file, (2, 0, 1))
			all_train_files.append(image_type(0, flat_file))

		else:
			print("image read failure")





	file_count = 0

	path = 'training_set/dogs/'
	filenames = next(os.walk(path))[2]

	for f in filenames:

		file_count = file_count + 1
		if file_count >= max_train_files:
			break

		print(path + f)
		img = cv2.imread(path + f)
		
		if img is not None:

			img = img.astype(np.float32)
			res = cv2.resize(img, dsize=(img_width, img_width), interpolation=cv2.INTER_LINEAR)
			flat_file = res / 255.0
			flat_file = np.transpose(flat_file, (2, 0, 1))
			all_train_files.append(image_type(1, flat_file))

		else:
			print("image read failure")




	batch = np.zeros((len(all_train_files), num_channels, img_width, img_width), dtype=np.float32)
	ground_truth = np.zeros((len(all_train_files), num_output_components), dtype=np.float32)	

	curr_net, curr_loss = do_network(None, batch, ground_truth, num_channels, num_output_components, all_train_files, round(time.time()), num_epochs)

	# Hill climbing: each round trains several child networks from the current best
	# and keeps any child that achieves a lower loss.
	for y in range(num_recursions):
		for x in range(num_child_networks):

			print(y, x)

			net, loss = do_network(curr_net, batch, ground_truth, num_channels, num_output_components, all_train_files, round(time.time()), num_epochs)

			if loss < curr_loss:

				curr_loss = loss
				curr_net = net







#	torch.save(net.state_dict(), 'weights_' + str(img_width) + '_' + str(num_epochs) + '.pth')



path = 'test_set/cats/'
filenames = next(os.walk(path))[2]

cat_count = 0
total_count = 0

for f in filenames:

	img = cv2.imread(path + f)
			
	if img is not None:

		img = img.astype(np.float32)
		res = cv2.resize(img, dsize=(img_width, img_width), interpolation=cv2.INTER_LINEAR)
		flat_file = res / 255.0
		flat_file = np.transpose(flat_file, (2, 0, 1))

	else:

		print("image read failure")
		continue

	batch = torch.zeros((1, num_channels, img_width, img_width), dtype=torch.float32)
	batch[0] = torch.from_numpy(flat_file)

	prediction = curr_net(Variable(batch))

	if prediction[0][0] > prediction[0][1]:
		cat_count = cat_count + 1

	total_count = total_count + 1

print(cat_count / total_count)
print(total_count)





path = 'test_set/dogs/'
filenames = next(os.walk(path))[2]

dog_count = 0
total_count = 0

for f in filenames:

	img = cv2.imread(path + f)
			
	if img is not None:

		img = img.astype(np.float32)
		res = cv2.resize(img, dsize=(img_width, img_width), interpolation=cv2.INTER_LINEAR)
		flat_file = res / 255.0
		flat_file = np.transpose(flat_file, (2, 0, 1))

	else:

		print("image read failure")
		continue

	batch = torch.zeros((1, num_channels, img_width, img_width), dtype=torch.float32)
	batch[0] = torch.from_numpy(flat_file)

	prediction = curr_net(Variable(batch))

	if prediction[0][0] < prediction[0][1]:
		dog_count = dog_count + 1

	total_count = total_count + 1

print(dog_count / total_count)
print(total_count)

Here is the same program with CUDA support enabled:

import numpy as np
import math
import cv2
import random
import torch
from torch import flatten
from torch.autograd import Variable
import torch.nn as nn
import torch.nn.functional as F

import os.path
from os import path

import time




img_width = 32
num_channels = 3

#num_input_components = img_width*img_width*num_channels
num_output_components = 2

num_epochs = 10
learning_rate = 0.001

max_train_files = 100000

num_recursions = 10
num_child_networks = 5


class Net(torch.nn.Module):

	def __init__(self, num_channels, num_output_components):

		super().__init__()
		self.model = torch.nn.Sequential(
		    #Input = 3 x 32 x 32, Output = 32 x 32 x 32
		    torch.nn.Conv2d(in_channels = num_channels, out_channels = 32, kernel_size = 3, padding = 1), 
		    torch.nn.ReLU(),
		    #Input = 32 x 32 x 32, Output = 32 x 16 x 16
		    torch.nn.MaxPool2d(kernel_size=2),
  
		    #Input = 32 x 16 x 16, Output = 64 x 16 x 16
		    torch.nn.Conv2d(in_channels = 32, out_channels = 64, kernel_size = 3, padding = 1),
		    torch.nn.ReLU(),
		    #Input = 64 x 16 x 16, Output = 64 x 8 x 8
		    torch.nn.MaxPool2d(kernel_size=2),
		      
		    #Input = 64 x 8 x 8, Output = 64 x 8 x 8
		    torch.nn.Conv2d(in_channels = 64, out_channels = 64, kernel_size = 3, padding = 1),
		    torch.nn.ReLU(),
		    #Input = 64 x 8 x 8, Output = 64 x 4 x 4
		    torch.nn.MaxPool2d(kernel_size=2),
  
		    torch.nn.Flatten(),
		    torch.nn.Linear(1024, 256),
		    torch.nn.ReLU(),
		    torch.nn.Linear(256, num_output_components)
		)
  
	def forward(self, x):
		return self.model(x)








class float_image:

	def __init__(self, img):
		self.img = img

class image_type:

	def __init__(self, img_type, float_img):
		self.img_type = img_type
		self.float_img = float_img





# Train (or continue training) a network on the GPU for num_epochs passes over the full training set,
# returning the trained network along with its final loss.
def do_network(in_net, batch, ground_truth, num_channels, num_output_components, all_train_files, random_seed, num_epochs):

	if (in_net is None):
		in_net = Net(num_channels, num_output_components)

	net = in_net

	random.seed(random_seed)
	torch.manual_seed(random_seed)	# seed PyTorch as well, so the seed also covers weight initialization

	net.to(torch.device("cuda:0"))	# move the model to the GPU before the optimizer is built

	optimizer = torch.optim.Adam(net.parameters(), lr = learning_rate)
	loss_func = torch.nn.MSELoss()

	loss = 0

	for epoch in range(num_epochs):
		
		random.shuffle(all_train_files)

		count = 0

		for i in all_train_files:

			batch[count] = i.float_img
		
			if i.img_type == 0: # cat

				ground_truth[count][0] = 1
				ground_truth[count][1] = 0
			
			elif i.img_type == 1: # dog
				
				ground_truth[count][0] = 0
				ground_truth[count][1] = 1

			count = count + 1
	
		x = Variable(torch.from_numpy(batch))
		y = Variable(torch.from_numpy(ground_truth))
		x = x.to(torch.device("cuda:0"))
		y = y.to(torch.device("cuda:0"))


		prediction = net(x)	 
		loss = loss_func(prediction, y)

		print(epoch, loss)

		optimizer.zero_grad()	 # clear gradients for next train
		loss.backward()		 # backpropagation, compute gradients
		optimizer.step()		# apply gradients
	
	return net, loss






if False: #path.exists('weights_' + str(img_width) + '_' + str(num_epochs) + '.pth'):
	net.load_state_dict(torch.load('weights_' + str(img_width) + '_' + str(num_epochs) + '.pth'))
	print("loaded file successfully")
else:
	print("training...")


	if torch.cuda.is_available():
		dev = "cuda:0"
	else: 
		dev = "cpu"

	device = torch.device(dev)

	print(device)




	all_train_files = []




	file_count = 0

	path = 'training_set/cats/'
	filenames = next(os.walk(path))[2]

	for f in filenames:

		file_count = file_count + 1
		if file_count >= max_train_files:
			break

		print(path + f)
		img = cv2.imread(path + f)
		
		if img is not None:

			img = img.astype(np.float32)
			res = cv2.resize(img, dsize=(img_width, img_width), interpolation=cv2.INTER_LINEAR)
			flat_file = res / 255.0
			flat_file = np.transpose(flat_file, (2, 0, 1))
			all_train_files.append(image_type(0, flat_file))

		else:
			print("image read failure")





	file_count = 0

	path = 'training_set/dogs/'
	filenames = next(os.walk(path))[2]

	for f in filenames:

		file_count = file_count + 1
		if file_count >= max_train_files:
			break

		print(path + f)
		img = cv2.imread(path + f)
		
		if img is not None:

			img = img.astype(np.float32)
			res = cv2.resize(img, dsize=(img_width, img_width), interpolation=cv2.INTER_LINEAR)
			flat_file = res / 255.0
			flat_file = np.transpose(flat_file, (2, 0, 1))
			all_train_files.append(image_type(1, flat_file))

		else:
			print("image read failure")




	batch = np.zeros((len(all_train_files), num_channels, img_width, img_width), dtype=np.float32)
	ground_truth = np.zeros((len(all_train_files), num_output_components), dtype=np.float32)	

	curr_net, curr_loss = do_network(None, batch, ground_truth, num_channels, num_output_components, all_train_files, round(time.time()), num_epochs)

	# Hill climbing: each round trains several child networks from the current best
	# and keeps any child that achieves a lower loss.
	for y in range(num_recursions):
		for x in range(num_child_networks):

			print(y, x)

			net, loss = do_network(curr_net, batch, ground_truth, num_channels, num_output_components, all_train_files, round(time.time()), num_epochs)

			if loss < curr_loss:

				curr_loss = loss
				curr_net = net







#	torch.save(net.state_dict(), 'weights_' + str(img_width) + '_' + str(num_epochs) + '.pth')



path = 'test_set/cats/'
filenames = next(os.walk(path))[2]

cat_count = 0
total_count = 0

for f in filenames:

	img = cv2.imread(path + f)
			
	if img is not None:

		img = img.astype(np.float32)
		res = cv2.resize(img, dsize=(img_width, img_width), interpolation=cv2.INTER_LINEAR)
		flat_file = res / 255.0
		flat_file = np.transpose(flat_file, (2, 0, 1))

	else:

		print("image read failure")
		continue

	batch = torch.zeros((1, num_channels, img_width, img_width), dtype=torch.float32)
	batch[0] = torch.from_numpy(flat_file)

	prediction = curr_net(Variable(batch.to(torch.device("cuda:0"))))	# the test batch must be on the same device as the model

	if prediction[0][0] > prediction[0][1]:
		cat_count = cat_count + 1

	total_count = total_count + 1

print(cat_count / total_count)
print(total_count)





path = 'test_set/dogs/'
filenames = next(os.walk(path))[2]

dog_count = 0
total_count = 0

for f in filenames:

	img = cv2.imread(path + f)
			
	if img is not None:

		img = img.astype(np.float32)
		res = cv2.resize(img, dsize=(img_width, img_width), interpolation=cv2.INTER_LINEAR)
		flat_file = res / 255.0
		flat_file = np.transpose(flat_file, (2, 0, 1))

	else:

		print("image read failure")
		continue

	batch = torch.zeros((1, num_channels, img_width, img_width), dtype=torch.float32)
	batch[0] = torch.from_numpy(flat_file)

	prediction = curr_net(Variable(batch.to(torch.device("cuda:0"))))	# the test batch must be on the same device as the model

	if prediction[0][0] < prediction[0][1]:
		dog_count = dog_count + 1

	total_count = total_count + 1

print(dog_count / total_count)
print(total_count)
