
Fillit – Packing Tetris pieces

This Ecole 42 project went faster than expected; I feel like I’m starting to get back into a productive programming mindset.

Description

Fillit is not about recoding Tetris, even though it is a variant of the game. The program takes a file containing a list of Tetriminos as a parameter and arranges them to form the smallest possible square.

The program

The executable “fillit” takes exactly one parameter: a file containing a list of Tetriminos to assemble. This file has a very specific format (see the input sample below).

Input Sample:

Each Tetrimino is represented by 4 lines of 4 characters, each line followed by a newline. A Tetrimino is a classic Tetris piece composed of 4 blocks. Each character must be either a block character (’#’) or an empty character (’.’). Each block of a Tetrimino must touch at least one other block on any of its 4 sides (up, down, left, and right).

...#
...#
...#
...#

....
....
....
####

.###
...#
....
....

....
..##
.##.
....

....
.##.
.##.
....

....
....
##..
.##.

##..
.#..
.#..
....

....
###.
.#..
....

Output Sample:

The program displays the smallest assembled square on the standard output. To identify each Tetrimino in the solution square, a capital letter is assigned to each one, starting with ’A’ and increasing for each new Tetrimino.

ABBBB.
ACCCEE
AFFCEE
A.FFGG
HHHDDG
.HDD.G

Solution

The program is divided into two parts: one that reads, validates, and organizes the input, and a second that solves the problem using backtracking. Here is how the program works:

Reading the input and preparing the data

  • The program takes the path of the input file as its argument; it prints usage instructions if no argument is given, and an error if there is more than one argument.
  • The function main() calls setup(), which reads the file 21 bytes at a time (one 4-line block plus its separating newline) while counting the blocks and formally validating each one (a validation sketch follows this list).
  • Knowing how many blocks are in the input file, gen_matrices() allocates a list of that many 4×4 matrices, and fill_matrices() fills them with the letters representing the different blocks.
  • The empty rows and columns of the matrices representing each block are removed by clean_row_matrices() and clean_column_matrices() (this is probably not the best way to do this).
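
To make the validation step concrete, here is a minimal sketch of how a 21-byte block can be checked. The function is hypothetical and simplified, not the actual fillit code, but it shows the idea: validate the characters, count the blocks, and check that the blocks touch.

/* Hypothetical sketch of the per-block validation done in setup().
** A block is 20 bytes ('#', '.' and a '\n' every 5th byte), plus a
** separating '\n' before the next block, hence the 21-byte reads. */
static int	valid_block(const char *buf)
{
	int	i;
	int	blocks;
	int	touch;

	blocks = 0;
	touch = 0;
	i = 0;
	while (i < 20)
	{
		if (i % 5 == 4 && buf[i] != '\n')
			return (0);
		else if (i % 5 != 4 && buf[i] == '#')
		{
			blocks++;
			if (i % 5 < 3 && buf[i + 1] == '#')
				touch++;
			if (i < 15 && buf[i + 5] == '#')
				touch++;
		}
		else if (i % 5 != 4 && buf[i] != '.')
			return (0);
		i++;
	}
	/* 4 blocks with at least 3 side contacts always form one
	** connected piece (the 2x2 square has 4 contacts) */
	return (blocks == 4 && touch >= 3);
}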

Solving the problem

  • The solver takes the list of Tetriminos and computes the minimum size the square must have to hold all the pieces (with n pieces, the smallest side whose square can contain 4 × n cells).
  • It generates a matrix of size min_size with the function gen_row().
  • It calls the function bfs_big_fucking_solver(), which takes the empty solution grid and the list of Tetriminos and solves the problem through backtracking, insertions, and restorations, without allocating any more memory (a simplified sketch follows this list).
  • The solution is printed by print_row(row).
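
For the curious, here is a rough, self-contained sketch of such a backtracking loop. It illustrates the technique, not the actual bfs_big_fucking_solver() code; the piece structure and helper names are made up.

/* Hypothetical piece: 4 (x, y) block offsets, a letter, and a
** pointer to the next piece in the list. */
typedef struct s_piece
{
	int				x[4];
	int				y[4];
	char			c;
	struct s_piece	*next;
}	t_piece;

static int	can_place(char **grid, t_piece *p, int x, int y, int size)
{
	int	i;

	i = 0;
	while (i < 4)
	{
		if (x + p->x[i] >= size || y + p->y[i] >= size
			|| grid[y + p->y[i]][x + p->x[i]] != '.')
			return (0);
		i++;
	}
	return (1);
}

static void	put(char **grid, t_piece *p, int x, int y, char c)
{
	int	i;

	i = 0;
	while (i < 4)
	{
		grid[y + p->y[i]][x + p->x[i]] = c;
		i++;
	}
}

/* Backtracking: place the current piece at the first legal spot,
** recurse on the rest of the list, undo the insertion on failure.
** No allocation happens here: the grid is modified and restored. */
static int	solve(char **grid, t_piece *piece, int size)
{
	int	x;
	int	y;

	if (!piece)
		return (1);
	y = -1;
	while (++y < size)
	{
		x = -1;
		while (++x < size)
		{
			if (can_place(grid, piece, x, y, size))
			{
				put(grid, piece, x, y, piece->c);
				if (solve(grid, piece->next, size))
					return (1);
				put(grid, piece, x, y, '.');
			}
		}
	}
	return (0);
}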

Notes

The matrices are zero-indexed and use the standard row/column naming convention (as shown in the image underneath). Every row is null-terminated and the last row is NULL.

[Figures: matrix notation, a worked example, and a how-to guide]
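
To make the row layout concrete, here is a hypothetical sketch (not the actual gen_row() code) of how such a grid can be allocated, with every row null-terminated and a final NULL sentinel.

#include <stdlib.h>

/* Illustrative allocator for a size x size grid of '.': every row
** is a null-terminated string and the row array ends with NULL. */
static char	**gen_grid(int size)
{
	char	**grid;
	int		x;
	int		y;

	grid = malloc(sizeof(char *) * (size + 1));
	if (!grid)
		return (NULL);
	y = 0;
	while (y < size)
	{
		grid[y] = malloc(size + 1);
		if (!grid[y])
			return (NULL);	/* sketch: real code would free the rows */
		x = 0;
		while (x < size)
			grid[y][x++] = '.';
		grid[y][size] = '\0';	/* every row is null-terminated */
		y++;
	}
	grid[size] = NULL;	/* the last row is NULL */
	return (grid);
}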


Libft – my own first C library

“Libft – your own first library” is an individual project at 42; it is the second one I worked on, after “Piscine Reloaded”.

The aim of this project is to code a C library gathering the usual functions that we will then be allowed to use in all our other projects. At 42 we are not allowed to use standard C library functions; we can only use functions we coded ourselves. So the longer-term goal of this library is to grow with our own functions.

For this library only 3 standard library functions are allowed: write() from <unistd.h>, and malloc() and free() from <stdlib.h>. We are also allowed to use <string.h> for access to size_t and NULL.

Here is the link to the GitHub repository: https://github.com/nmanzini/libft-42

Libft is the second project we worked on after starting the school. It required a lot of checking for all the possible cases in which a standard function can fail or misbehave. This library has been thoroughly tested.

Lib C Functions: memset, bzero, memcpy, memccpy, memmove, memchr, memcmp, strlen, strdup, strcpy, strncpy, strcat, strlcat, strchr, strrchr, strstr, strnstr, strcmp, strncmp, atoi, isalpha, isdigit, isalnum, isascii, isprint, toupper, tolower

42 Functions: ft_memalloc, ft_memdel, ft_strnew, ft_strdel, ft_strclr, ft_striter, ft_striteri, ft_strmap, ft_strmapi, ft_strequ, ft_strnequ, ft_strsub, ft_strjoin, ft_strtrim, ft_strsplit, ft_itoa, ft_putchar, ft_putstr, ft_putendl, ft_putnbr, ft_putchar_fd, ft_putstr_fd, ft_putendl_fd, ft_putnbr_fd

List Functions: ft_lstnew, ft_lstdelone, ft_lstdel, ft_lstadd, ft_lstiter, ft_lstmap

There are 3 kinds of functions:

  • Lib C Functions: functions coming from the standard C libraries; they are prefixed with “ft_” to avoid conflicts with the original ones (see the sketch after this list).
  • 42 Functions: functions used at Ecole 42, mainly for memory and string manipulation.
  • List Functions: used for list creation and manipulation.
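
As a taste of what these reimplementations look like, here is a minimal illustrative sketch of one Lib C function, using only the allowed malloc(); the body was written for this post, not copied from the repository.

#include <stdlib.h>

/* Illustrative ft_strdup: duplicates a string like the standard
** strdup, but written from scratch with only malloc allowed. */
char	*ft_strdup(const char *s)
{
	char	*dup;
	size_t	len;
	size_t	i;

	len = 0;
	while (s[len])
		len++;
	dup = malloc(len + 1);
	if (!dup)
		return (NULL);
	i = 0;
	while (i <= len)	/* copy the terminating '\0' too */
	{
		dup[i] = s[i];
		i++;
	}
	return (dup);
}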

Recreating all these functions is a great exercise to understand what they do in detail, their limitations, and their unexpected results. Unit testing them is an even greater challenge than writing them.
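
For example, a test has to cover the edge cases the man pages barely mention. A minimal sketch of such a check (hypothetical test code, comparing the ft_strdup above against the original behavior) could look like this:

#include <assert.h>
#include <string.h>
#include <stdlib.h>

char	*ft_strdup(const char *s);

/* Minimal unit-test sketch: check ft_strdup on a normal string
** and on the empty-string edge case. */
int	main(void)
{
	char	*mine;

	mine = ft_strdup("hello");
	assert(mine != NULL && strcmp(mine, "hello") == 0);
	free(mine);
	mine = ft_strdup("");	/* edge case: empty string */
	assert(mine != NULL && mine[0] == '\0');
	free(mine);
	return (0);
}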

At 42 we don’t have to study what a function does: when you are stuck for hours recoding it, you are forced to understand it deeply.


Single hidden layer neural network

To test my understanding of neural networks and deep learning, I used what I learned from the Deep Learning Coursera specialization, and the code I developed for its assignments, to tackle the Titanic Kaggle competition.

In the last months I have been following an amazing course taught by Andrew Ng for deeplearning.ai on Coursera.

This course introduces you to deep learning from the beginning; you just need some basic knowledge of linear algebra and programming to get started. Nothing major, but it helps.

I have so far taken the first 3 classes:

  • Neural Networks and Deep Learning
  • Improving Deep Neural Networks: Hyperparameter Tuning, Regularization and Optimization
  • Structuring Machine Learning Projects

Kaggle provides you with two .csv files listing the passengers of the Titanic. One file is for training the neural network: it has the information about each passenger, including whether they survived the sinking. The other file is the test set: it has all the information about the passengers but nothing about their survival. Kaggle then asks you to upload a .csv file with the survival prediction for every passenger in the test set.

The information provided is:

  • Survival
  • Ticket class
  • Sex
  • Age in years
  • # of siblings / spouses aboard the Titanic
  • # of parents / children aboard the Titanic
  • Ticket number
  • Passenger fare
  • Cabin number
  • Port of Embarkation
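
As a rough illustration of the preprocessing step (assuming pandas and the standard Kaggle column names; this is not necessarily how mono_layer.py prepares its data), turning these fields into numerical matrices can look like this:

import pandas as pd

# Illustrative preprocessing sketch (hypothetical, assuming the
# standard Kaggle Titanic column names): keep a few numeric
# features, encode sex as 0/1, and fill in the missing ages.
train = pd.read_csv("train.csv")
train["Sex"] = (train["Sex"] == "female").astype(int)
train["Age"] = train["Age"].fillna(train["Age"].median())

features = ["Pclass", "Sex", "Age", "SibSp", "Parch", "Fare"]
X = train[features].to_numpy().T                  # shape (n_features, m)
Y = train["Survived"].to_numpy().reshape(1, -1)   # shape (1, m)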

From this information I trained a single hidden layer neural network, with ReLU activation in the hidden layer and a sigmoid for the last node.

Single hidden layer neural network

A single hidden layer neural network consists of 3 layers: input, hidden and output.
The input layer holds all the values from the input, in our case numerical representations of the ticket class, ticket number, fare, sex, age, and so on.

The hidden layer is where most of the calculation happens: every perceptron unit takes the values from the input layer, multiplies them by initially random weights, and adds them up. This intermediate output is not ready to leave the perceptron yet; it has to be activated by a function, in this case a ReLU.

Perceptron

The third and last layer is the output layer. It takes all the perceptron outputs of the previous layer as input, multiplies them by initially random weights, adds them up, and then activates the result with a sigmoid function. This layer outputs a value between zero and one, which is the likelihood that a passenger survives.

Sigmoid and ReLU activation functions
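
In code, the whole forward pass is just a few NumPy matrix operations. Here is a minimal sketch of the idea (illustrative, not the exact mono_layer.py code):

import numpy as np

def relu(z):
    return np.maximum(0, z)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Forward pass for one hidden layer. X is (n_features, m);
# W1 is (n_hidden, n_features), W2 is (1, n_hidden); the weights
# start as small random values, the biases as zeros.
def forward(X, W1, b1, W2, b2):
    Z1 = W1 @ X + b1      # hidden layer: multiply and add
    A1 = relu(Z1)         # ReLU activation
    Z2 = W2 @ A1 + b2     # output layer: multiply and add
    A2 = sigmoid(Z2)      # value between 0 and 1: survival likelihood
    return Z1, A1, A2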

To train the network we rely on gradient descent and backpropagation of the gradients. The output values are compared with the correct answers to compute the value of a predefined error function, and the error is then fed back through the network. Using this information, the algorithm adjusts the initially random weights of each connection in order to reduce the value of the error function by some small amount. After repeating this process for a sufficiently large number of training cycles, the network will usually converge to a state where the error of the calculations is small; one would then say that the network has learned a certain target function.

To adjust the weights properly, one applies a general method for non-linear optimization called gradient descent: the network calculates the derivative of the error function with respect to the network weights and changes the weights such that the error decreases (thus going downhill on the surface of the error function). For this reason, backpropagation can only be applied to networks with differentiable activation functions.
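
Concretely, one training cycle of such a network can be sketched in a few lines of NumPy (again an illustration of the technique, with the cross-entropy cost used in the course, not the exact mono_layer.py code):

import numpy as np

# One gradient-descent step for the single hidden layer network:
# forward pass, cost, backpropagated gradients, weight update.
def train_step(X, Y, W1, b1, W2, b2, learning_rate=0.01):
    m = X.shape[1]
    # forward pass (see the previous sketch)
    Z1 = W1 @ X + b1
    A1 = np.maximum(0, Z1)                          # ReLU
    A2 = 1.0 / (1.0 + np.exp(-(W2 @ A1 + b2)))      # sigmoid output

    # cross-entropy error between predictions and correct answers
    cost = -np.mean(Y * np.log(A2) + (1 - Y) * np.log(1 - A2))

    # backpropagation: derivative of the cost w.r.t. each weight
    dZ2 = A2 - Y
    dW2 = dZ2 @ A1.T / m
    db2 = dZ2.sum(axis=1, keepdims=True) / m
    dZ1 = (W2.T @ dZ2) * (Z1 > 0)                   # ReLU derivative
    dW1 = dZ1 @ X.T / m
    db1 = dZ1.sum(axis=1, keepdims=True) / m

    # go downhill on the error surface by a small step
    W1 -= learning_rate * dW1
    b1 -= learning_rate * db1
    W2 -= learning_rate * dW2
    b2 -= learning_rate * db2
    return cost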

The neural network code is in this file: mono_layer.py. The network is managed entirely through matrix calculations thanks to NumPy, and it doesn’t use any library for the neural network itself (like TensorFlow).

In the first tests I noticed that small changes in the hyper-parameters greatly change the ability of the training to reach a low cost in a short time. Here I will show a graph with random values for the learning rate and for the number of units in the hidden layer. I ran only 6 tests to keep the visualization simple; here are the costs, sampled 10 times over 25,000 iterations.

This graph didn’t convince me much, so I then sampled every cost value to see how it behaved while going down.

Here you can see how the cost behaves when every value is plotted instead of sampled. All the tests reach a satisfactory Test_accuracy of around 80%.

Being deeply interested in this graph, I decided to go for 50,000 iterations.

The accuracy after 50,000 iterations seems to keep rising for most tests, so I decided to continue up to 100,000 iterations.

In the 100,000 iterations graph we can see that the improvement of the algorithm starts to stagnate. Test 3 reaches the best absolute Test_accuracy of 81.84%, but test 2 loses most of its accuracy: the algorithm has been over-fitting during the last 50,000 iterations.

Here the values stabilize a lot, but the accuracy didn’t really improve.

This is just the first post I will write about this test and about neural networks in general. I submitted the predictions for the Kaggle test set before doing these performance studies and scored 0.76555 (leaderboard position 5913). Then I submitted predictions based on the knowledge acquired from the new tests and scored 0.79425 (leaderboard position 2374).