Week 9: 10/30 – 11/3

This week I started working on transforming a CSV file of sentences into the data format:

label sentence#1 sentence#2 other_info

I am using pandas to import the CSV file and load it into a data frame:

import pandas as pd
frame = pd.read_csv('Bag_of_Words_model.csv', names = ["Sentences"])

I then insert another column at the beginning to serve as the label. It is filled with zeroes for now, since the Sentence Match Decoder will have to decide this label.

frame.insert(loc=0, column='Values', value=0)

I want to be able to compare every sentence with every other sentence in an article. As a first step, I duplicate the sentences column that I currently have and shift the copy up by one row, which pairs each sentence with the one that follows it.

frame['Compare'] = frame['Sentences']
frame.Compare = frame.Compare.shift(-1)
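
The shift above only pairs neighbouring sentences. One possible way to get every pair is sketched below, assuming the sentences fit in memory and that the order within a pair does not matter. The toy sentences and the name all_pairs are made up for illustration, and this is not necessarily how the final pipeline will do it.

```python
from itertools import combinations

import pandas as pd

# Toy stand-in for the sentences of one article (hypothetical data).
sentences = ["The cat sat.", "It was warm.", "Then it slept."]

# combinations() yields each unordered pair exactly once,
# so n sentences give n * (n - 1) / 2 rows.
pairs = list(combinations(sentences, 2))

# Reuse the column names from above, with the placeholder label set to 0.
all_pairs = pd.DataFrame(pairs, columns=["Sentences", "Compare"])
all_pairs.insert(loc=0, column="Values", value=0)

print(all_pairs)
```

The number of rows grows quadratically with the number of sentences, so for long articles it may be worth generating the pairs per article rather than for the whole file.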

This then gives me a data frame with three columns: Values (all zeroes), Sentences, and Compare (the sentence that follows).

This is close to what we need the input data to look like.
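
A rough sketch of how the remaining gap might be closed is shown below. Because of shift(-1), the last row has no following sentence and holds NaN in Compare, so it is dropped before writing. The tab delimiter, the missing header row, and leaving out other_info are all assumptions on my part, since the exact layout of the target format is not pinned down here, and the toy sentences are made up.

```python
import io

import pandas as pd

# Toy stand-in for the real CSV contents (hypothetical data).
frame = pd.DataFrame(
    {"Sentences": ["The cat sat.", "It was warm.", "Then it slept."]}
)
frame.insert(loc=0, column="Values", value=0)
frame["Compare"] = frame["Sentences"].shift(-1)

# shift(-1) leaves NaN in the last row's Compare cell, so drop that row.
pairs = frame.dropna(subset=["Compare"])

# Write label, sentence#1, sentence#2 as tab-separated text (assumed layout).
# An in-memory buffer is used here; a file path would work the same way.
buffer = io.StringIO()
pairs.to_csv(buffer, sep="\t", header=False, index=False)
output = buffer.getvalue()

print(output)
```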