applied_machine_learning_public

July 20, 2020 Response

Problem Statement:

Fake news is a prevalent danger in today’s society. The inability to distinguish between factual information and made-up stories posing as news has led to misinformed citizens, and therefore, a misinformed electorate, which is a challenge to democracy. To quell this problem, it is necessary to create a model that distinguishes between news written by professional journalists and that written by non-professionals hoping to spread fake information. The model will be able to predict what category an article falls into for a reader trying to avoid fake news. This can be done through natural language processing by looking for patterns in sentence structure, word choice, and tone. This will require tokenization as well as an embedding layer in the model.
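
As a rough sketch of that last point, here is a minimal Keras example of tokenization feeding an embedding layer, assuming a hypothetical `texts` list of article strings; the vocabulary size, sequence length, and layer sizes are placeholder assumptions, not final design choices.

```python
# Minimal sketch: tokenize articles and feed them through an embedding layer.
# VOCAB_SIZE, MAX_LEN, and the layer sizes below are placeholder assumptions.
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras import layers, models

VOCAB_SIZE = 10000   # assumed vocabulary size
MAX_LEN = 300        # assumed article length in tokens

# texts = [...]  # hypothetical list of article strings (real and fake), not shown
tokenizer = Tokenizer(num_words=VOCAB_SIZE)
# tokenizer.fit_on_texts(texts)
# sequences = pad_sequences(tokenizer.texts_to_sequences(texts), maxlen=MAX_LEN)

model = models.Sequential([
    layers.Embedding(VOCAB_SIZE, 64, input_length=MAX_LEN),  # learned word vectors
    layers.GlobalAveragePooling1D(),        # average word vectors into one article vector
    layers.Dense(32, activation="relu"),
    layers.Dense(1, activation="sigmoid"),  # 1 = professional journalism, 0 = fake news
])
```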

Cats and Dogs:

  1. I used RMSprop as my optimizer and set the learning rate to 0.001. I chose RMSprop because it adapts rprop to work with mini-batches, so it performs better than rprop on larger datasets that are split into batches.

  2. For my loss function I used binary cross-entropy. It works by taking the mean of the negative log of the predicted probability assigned to each data point's true class. Each predicted probability is between 0 and 1, since binary cross-entropy applies to data with two classes: a probability of 1 means the model is certain the data point has the corresponding label (the true class), and a probability of 0 means the model is certain it has the other label. This is an effective way of penalizing bad predictions because as the predicted probability for the true class approaches zero (as the prediction gets worse), the negative log, and therefore the loss, grows without bound.

  3. The metrics argument determines how the model’s performance is judged. As with the optimizer and the loss function, there are multiple options to choose from for how this is implemented. For example, the accuracy metric calculates how often the model’s predictions equal the actual labels. A compile call putting the optimizer, loss, and metric together is sketched after the graphs below.

  4. The model had a training accuracy of almost 98% but a validation accuracy of only about 60%, so it is clearly very overfit. This is reflected in the graphs below, where the validation curves start to diverge drastically from the training curves, for both accuracy and loss, after about 7 epochs.

[Plots of training vs. validation accuracy and loss]
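
Putting points 1 through 3 together, here is a minimal sketch of the compile step; the one-layer stand-in model is only there so the snippet runs on its own, since the actual CNN architecture is not reproduced in this write-up.

```python
from tensorflow.keras import layers, models, optimizers

# Trivial stand-in for the real cats-vs-dogs CNN (architecture not shown here)
model = models.Sequential([layers.Flatten(input_shape=(150, 150, 3)),
                           layers.Dense(1, activation="sigmoid")])

model.compile(optimizer=optimizers.RMSprop(learning_rate=0.001),  # point 1
              loss="binary_crossentropy",  # point 2: mean of -log(p of the true class)
              metrics=["accuracy"])        # point 3: fraction of predictions matching labels

# How binary cross-entropy penalizes a single prediction p when the true label is 1:
#   p = 0.9  ->  -log(0.9) ≈ 0.105  (confident and correct: small loss)
#   p = 0.1  ->  -log(0.1) ≈ 2.303  (confident and wrong: large loss)
```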

  1. Overall, the model did not perform well in practice, as it correctly identified only one of the three dogs. This is most likely because the model is overfit and unable to generalize well enough to classify these new images. To combat this, we could employ image augmentation, so the model is trained on a greater variety of images: existing images can be rotated, zoomed, or flipped to create more training data. We could also apply dropout, a regularization technique that randomly zeroes a fraction of a layer's outputs during training so the network cannot rely too heavily on any single feature. Both fixes are sketched below.
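
Here is a minimal sketch of both fixes, assuming the standard Keras ImageDataGenerator workflow; the directory path, image size, and the surrounding architecture are placeholder assumptions rather than the actual model.

```python
# Sketch of image augmentation plus a dropout layer to fight overfitting.
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras import layers, models

# Augmentation: random rotations, zooms, and flips so the model never sees
# exactly the same training image twice.
train_datagen = ImageDataGenerator(rescale=1.0 / 255,
                                   rotation_range=40,
                                   zoom_range=0.2,
                                   horizontal_flip=True)
# train_generator = train_datagen.flow_from_directory("train/",  # placeholder path
#                                                     target_size=(150, 150),
#                                                     batch_size=32,
#                                                     class_mode="binary")

model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation="relu", input_shape=(150, 150, 3)),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dropout(0.5),                    # randomly zero half the activations each step
    layers.Dense(256, activation="relu"),
    layers.Dense(1, activation="sigmoid"),  # cat vs. dog
])
```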

Predictions on the six test images, in order: cat, cat, cat, cat, cat, dog.

[six test images]