Poefier Week-6

Published in

BBM406 Spring 2021 Projects

2 min read · May 24, 2021

We have come to the end of our project. So far, we have seen how a machine learning project can be built from scratch by taking a dataset and processing it step by step. Before moving on to this last weekly report, let us briefly review what we have done week by week.

WEEK-1

We started by explaining our subject in the first week. We tried to answer the following questions: What do we want to do? Why did we choose this subject? What path will we follow? We also talked about our dataset and briefly gave some information about it. For more information you can check here.

WEEK-2

In week 2 we started to investigate what text classification is and shared some academic papers about it. Then we moved on to investigating our dataset. How many poems are in it? What is the distribution of type? What is the distribution of age? These and many other features of our dataset were presented that week. Please follow this link for further information.

WEEK-3

In week 3 we discussed bag-of-words and word-embedding methods and shared related academic papers. You can check this link to see them.
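As a rough illustration of the bag-of-words idea, here is a minimal sketch in plain Python. The function names (`tokenize`, `build_vocabulary`, `bag_of_words`) and the two sample lines are placeholders we made up for this post; they are not the code or the data from the project, and a real pipeline would more likely rely on a library vectorizer.

```python
from collections import Counter

PUNCTUATION = ".,;:!?\"'()"


def tokenize(text):
    """Lowercase the text, split on whitespace and strip basic punctuation."""
    tokens = (word.strip(PUNCTUATION) for word in text.lower().split())
    return [tok for tok in tokens if tok]


def build_vocabulary(documents):
    """Map every distinct token to a column index, in alphabetical order."""
    distinct = sorted({tok for doc in documents for tok in tokenize(doc)})
    return {tok: index for index, tok in enumerate(distinct)}


def bag_of_words(text, vocabulary):
    """Return a count vector; tokens outside the vocabulary are ignored."""
    vector = [0] * len(vocabulary)
    for tok, count in Counter(tokenize(text)).items():
        if tok in vocabulary:
            vector[vocabulary[tok]] = count
    return vector


# Hypothetical sample lines, not taken from our dataset.
poems = [
    "The rose is red, the violet blue",
    "The sea is wide and the sea is deep",
]
vocabulary = build_vocabulary(poems)
vectors = [bag_of_words(poem, vocabulary) for poem in poems]
```

Each poem becomes a fixed-length vector of word counts, so word order is lost but the result can be fed directly to a classifier.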

WEEK-4

In week 4 we put things into practice and wrote some code. We tested our data with the most well-known machine learning algorithms and compared them: KNN, W-KNN, Logistic Regression, SVM, Naive Bayes, Decision Tree and Random Forest Classifier. The results were analyzed in two different categories, age and type. Sample code for the algorithms was shared. You can click here for more.
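To give a feel for how two of these algorithms differ, here is a small sketch of KNN next to a distance-weighted variant, which is what we assume W-KNN stands for here. The function names, the two-dimensional toy vectors and the labels are hypothetical and chosen only for illustration; this is not the code we ran on our dataset.

```python
import math
from collections import defaultdict


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train, query, k=3, weighted=False):
    """Predict a label for `query` from `train`, a list of (vector, label).

    With weighted=False every one of the k nearest neighbours casts one vote.
    With weighted=True each vote is scaled by 1 / distance, so closer
    neighbours count more (our assumed reading of W-KNN).
    """
    neighbours = sorted(train, key=lambda item: euclidean(item[0], query))[:k]
    votes = defaultdict(float)
    for vector, label in neighbours:
        distance = euclidean(vector, query)
        # The small constant avoids division by zero for identical points.
        votes[label] += 1.0 / (distance + 1e-9) if weighted else 1.0
    return max(votes, key=votes.get)


# Hypothetical toy data: one close neighbour and two distant ones.
train = [
    ([0.0, 0.0], "haiku"),
    ([3.0, 0.0], "sonnet"),
    ([0.0, 3.0], "sonnet"),
]
query = [0.5, 0.5]

plain = knn_predict(train, query, k=3)
weighted = knn_predict(train, query, k=3, weighted=True)
```

On this toy data the plain vote follows the two distant neighbours, while the weighted vote follows the single close one, which is the kind of difference that can show up when comparing the two methods.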

WEEK-5

In week 5 we decided to use deep learning methods to make our model more complex. After our meetings with the teaching assistant, we settled on BERT. We explained what BERT is and how it works, and shared academic papers related to it. You can see it from the link here.

After this summary, let us come to this week. We built our BERT model and managed to run it on our data. We also shared our results with the course assistant. I want to leave some of these results below.

The results are pretty accurate for age classification, but for type classification the basic machine learning algorithms gave better results, as can be seen in week 4.

It was a tiring but fun process for us. We were very motivated to do such a project in a course where we worked hard. We will present the finalized version of the project next week. See you next week!

Dataset

You can access our dataset here.

Group Members

Alihan Karatatar — 21904324

Atakan Yüksel — 21627892

Ceren Korkmaz — 21995445


Written by alihan