Weekly Report - 24/7/15

First, I made the MultinomialNaiveBayes classifier sort its predictions by confidence and print them in that order to System.out. The application accepts multiple files for training; once trained, another file can be specified for testing.
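
The ordering step can be illustrated with a minimal Python sketch. It is not the application's actual code: the `Prediction` record, the sample events, the confidence values, and the choice of most-confident-first order are all assumptions made for illustration.

```python
from typing import NamedTuple


class Prediction(NamedTuple):
    event: str         # raw log line (made-up example data)
    label: str         # predicted class, e.g. "safe" or "unsafe"
    confidence: float  # probability the classifier assigned to `label`


def sort_by_confidence(predictions):
    """Return the predictions ordered from most to least confident."""
    return sorted(predictions, key=lambda p: p.confidence, reverse=True)


# Hypothetical classifier output, in the order the events were read.
predictions = [
    Prediction("sshd: Accepted password for alice", "safe", 0.71),
    Prediction("sshd: Failed password for root", "unsafe", 0.98),
    Prediction("cron: job started", "safe", 0.55),
]

for p in sort_by_confidence(predictions):
    print(f"{p.confidence:.2f}\t{p.label}\t{p.event}")
```

Reversing the order (least confident first) may be more useful when the goal is to review the predictions most likely to be wrong.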

Next, to let the user correct the program's mistakes, I made a GUI that displays the output in a table. The user can then go through each event and update whether or not it is safe. Finally, once the user is happy, they can save the instances to an ARFF file, which can be fed back in for training so that the classifier improves on its previous mistakes.
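
As a rough idea of what the saved file might contain, the sketch below builds ARFF text from corrected (event, label) pairs. The relation name, the two attributes (`text` and `class`), and the `safe`/`unsafe` labels are assumptions; the real application may store different attributes.

```python
def arff_quote(text):
    """Quote a string value, escaping backslashes and single quotes."""
    escaped = text.replace("\\", "\\\\").replace("'", "\\'")
    return f"'{escaped}'"


def to_arff(relation, instances):
    """Build ARFF text from (event_text, label) pairs."""
    lines = [
        f"@relation {relation}",
        "",
        "@attribute text string",          # assumed: the raw event text
        "@attribute class {safe,unsafe}",  # assumed: the two class labels
        "",
        "@data",
    ]
    for text, label in instances:
        lines.append(f"{arff_quote(text)},{label}")
    return "\n".join(lines) + "\n"


# Hypothetical events after the user has corrected their labels.
corrected = [
    ("sshd: Failed password for root", "unsafe"),
    ("cron: job started", "safe"),
]

arff_text = to_arff("log_events", corrected)
print(arff_text)
```

Writing `arff_text` to a file with a `.arff` extension would give something that can be loaded alongside the original training files.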

Now I'm looking into grouping events from multiple files that occur within some time period of one another, e.g. 60 seconds, to see how prevalent the connections between multiple file types are. Some measure of similarity between events will be needed, so each "word" will be treated as a token when comparing two events. This can be adjusted to weight certain parts of an event: for example, an IP address can be treated as four separate parts, so that instead of having a weight of one when the whole address matches, a full match has a weight of four and a match on just the network section has a weight of two.
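
One possible shape for this measure is sketched below. The function names, the whitespace tokenisation, and the rule of scoring an IPv4 pair by its number of matching leading octets are assumptions; that rule gives four for a full match and two when only the first two octets match, and generalises to one and three.

```python
import re

# Simple IPv4 shape check (does not validate that octets are <= 255).
IPV4 = re.compile(r"\d{1,3}(\.\d{1,3}){3}")


def weighted_match(a, b):
    """Score how well two tokens match.

    Ordinary tokens score 1 if identical, else 0. Two IPv4 addresses
    score the number of leading octets they share (0 to 4).
    """
    if IPV4.fullmatch(a) and IPV4.fullmatch(b):
        score = 0
        for x, y in zip(a.split("."), b.split(".")):
            if x != y:
                break
            score += 1
        return score
    return 1 if a == b else 0


def event_similarity(event_a, event_b):
    """Sum, over the distinct tokens of event_a, the best match in event_b."""
    tokens_a = set(event_a.split())
    tokens_b = set(event_b.split())
    return sum(
        max((weighted_match(a, b) for b in tokens_b), default=0)
        for a in tokens_a
    )


# Made-up events from two different log types.
auth_event = "Failed password for root from 192.168.1.10"
web_event = "Connection from 192.168.7.22 closed"
print(event_similarity(auth_event, web_event))  # "from" (1) + shared 192.168 prefix (2) = 3
```

Note that this score is not symmetric and is not normalised by event length; both would be worth revisiting once real log data is compared.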

To find the time between events I will need to convert the timestamps into a usable format, and this conversion will need to be customised for log files with different timestamp formats.
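
A minimal sketch of the conversion and the time-based grouping follows. The log type names, their timestamp patterns, and the sample events are hypothetical, and "within 60 seconds of one another" is interpreted here as each event being at most 60 seconds after the previous event in its group.

```python
from datetime import datetime, timedelta

# Hypothetical mapping from log type to its timestamp pattern.
TIMESTAMP_FORMATS = {
    "auth": "%Y-%m-%d %H:%M:%S",
    "web": "%d/%m/%Y %H:%M:%S",
}


def parse_timestamp(log_type, raw):
    """Convert a raw timestamp string into a datetime for the given log type."""
    return datetime.strptime(raw, TIMESTAMP_FORMATS[log_type])


def group_by_window(events, window=timedelta(seconds=60)):
    """Group (timestamp, log_type, message) tuples by time.

    Events are sorted by timestamp; an event joins the current group if
    it occurs within `window` of the previous event, otherwise it starts
    a new group.
    """
    groups = []
    for event in sorted(events, key=lambda e: e[0]):
        if groups and event[0] - groups[-1][-1][0] <= window:
            groups[-1].append(event)
        else:
            groups.append([event])
    return groups


# Made-up events from two log files with different timestamp formats.
raw_events = [
    ("auth", "2015-07-24 10:00:00", "Failed password for root"),
    ("web", "24/07/2015 10:00:45", "GET /admin"),
    ("auth", "2015-07-24 10:05:00", "Accepted password for alice"),
]
events = [(parse_timestamp(t, ts), t, msg) for t, ts, msg in raw_events]
groups = group_by_window(events)

for group in groups:
    print([f"{t}: {msg}" for _, t, msg in group])
```

With this chained interpretation a long run of closely spaced events forms a single group even if its first and last events are more than 60 seconds apart; a fixed window anchored on the first event of each group is an alternative.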