### Project Members

Meenakshee Mungro | admin |

### Rating the Significance of Detected Network Events

### 03

#### Apr

#### 2014

Once the eventing script had been tested properly, I moved on to including the option of producing a CSV file out of the results.

Afterwards, I wrote a Python script to read in the manually classified events (ground truth) and the results from the eventing script. The entries are then compared and matched to produce a list containing the following info: the timestamp at which the event started and its severity score (from the ground truth), the fusion method probabilities (from the eventing results, which include DS, Bayes, averaging and a newer method which counts the number of detectors that fired and allocates an appropriate severity score), and finally the timestamp at which each fusion method detected that the event group is significant. This is useful for determining which fusion method performs best, e.g. fastest at detecting significant events, smallest number of FPs (ground truth says not significant, but the fusion method detects significance), etc.

The script performs better than I expected after testing (e.g. 46 event groups from the ground truth and 42 matched event group/probability results from the eventing script when tested with Google's stream). The remaining unmatched events will need to be manually sorted out, so hopefully the script will perform as well on AMP data.
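The matching step described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual script: the function name `match_events`, the tuple layouts and the 300-second tolerance are all hypothetical, and the real comparison may use different criteria.

```python
from bisect import bisect_left


def match_events(ground_truth, detected, tolerance=300):
    """Pair each ground-truth event with the nearest detected event group.

    ground_truth: list of (start_ts, severity) tuples.
    detected: list of (start_ts, probabilities) tuples, sorted by start_ts.
    tolerance: largest allowed gap in seconds (hypothetical value).
    Returns (matched, unmatched); unmatched entries need manual sorting.
    """
    detected_ts = [ts for ts, _ in detected]
    used = set()
    matched, unmatched = [], []
    for gt_ts, severity in sorted(ground_truth):
        i = bisect_left(detected_ts, gt_ts)
        # Only the neighbours either side of the insertion point can be nearest.
        candidates = [j for j in (i - 1, i)
                      if 0 <= j < len(detected) and j not in used]
        best = min(candidates,
                   key=lambda j: abs(detected_ts[j] - gt_ts),
                   default=None)
        if best is not None and abs(detected_ts[best] - gt_ts) <= tolerance:
            used.add(best)
            matched.append((gt_ts, severity, detected[best][1]))
        else:
            unmatched.append((gt_ts, severity))
    return matched, unmatched
```

Each detected group is used at most once, so two ground-truth events cannot both claim the same result.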

### 25

#### Mar

#### 2014

Spent a fair bit of time finishing my detector probability script and making it look less awful. Then, spent the rest of the week updating the eventing script to use the new detector probabilities and also updated the initial Sig and FP probabilities used by the Bayes fusion method. Then, added options to the eventing script to allow outputting the values of the different fusion methods, detectors and event grouping methods. The remaining time was spent testing the new changes and fixing a couple of minor bugs.

### 18

#### Mar

#### 2014

Last week was a bit slow; I spent ages double-checking the values in the spreadsheet and making sure the correct cell ranges and formulas were being used. Found quite a few mistakes, so it wasn't entirely a waste of time. Then, spent the rest of the week updating the detector probabilities script with the new values. This took forever too since there are several detectors and two methods, each with their own probability values and event grouping, etc.

### 12

#### Mar

#### 2014

Spent the first half of the week finishing up events and double-checking the values, and finally got a sufficient sample size for each of the event group categories that I was using. Then, I spent an unfortunate amount of time calculating the values to use for the different belief fusion methods for each of the groups. Turns out that Google Docs is not that smart and doesn't know exactly what I want it to do. So, lots of manual cell address entering, formula rechecking, and whatnot.

Finally, I started updating the eventing script to add the recent changes, e.g. Hidden Markov Model detector. I also added triggers to the eventing script to detect when each of the fusion methods reaches/exceeds a significance probability value (95% currently). This will be used to determine which fusion method is the fastest at declaring which events are significant. Also added code to output the order in which the detectors fired for any event group, since it might be interesting to see if there is a pattern in there.
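A minimal sketch of the threshold triggers described above, assuming the eventing script sees a time-ordered series of per-method probabilities for an event group. The names `first_crossings` and `fastest_method` and the input layout are hypothetical; only the 95% threshold comes from the entry itself.

```python
SIGNIFICANCE_THRESHOLD = 0.95  # the significance probability value used currently


def first_crossings(updates, threshold=SIGNIFICANCE_THRESHOLD):
    """Record when each fusion method first reaches the threshold.

    updates: iterable of (ts, {method_name: probability}) in time order
             (hypothetical layout).
    Returns {method_name: ts of first crossing}.
    """
    crossed = {}
    for ts, probabilities in updates:
        for method, p in probabilities.items():
            # Keep only the first timestamp at which a method fires.
            if method not in crossed and p >= threshold:
                crossed[method] = ts
    return crossed


def fastest_method(crossed):
    """Return the method with the earliest crossing, or None if none crossed."""
    if not crossed:
        return None
    return min(crossed, key=crossed.get)
```

Methods that never reach the threshold are simply absent from the result, which keeps "never declared significant" distinct from "declared late".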

### 07

#### Mar

#### 2014

Was away for half of the week, so spent Thursday and Friday working on categorizing more streams so that I would have a decent sample size of events for each stream group. This is getting trickier since we are running out of streams and the ones remaining don't necessarily belong to the stream groups that have an insufficient sample size.

### 18

#### Feb

#### 2014

Spent the past 2 weeks collecting more samples of event groups and updated the data in the spreadsheet, so I'll have a better idea of which groups have an insufficient sample size. Andrew had already finalised and entered the data for his HMMDetector for the old streams, so I made sure to include HMM events in the newer streams I analysed (afrinic, lacnic, trademe and apnic).

I also realised I had mislabelled some events detected by the Changepoint detector whenever a loss in measurements occurred, so I spent some time double-checking the events and the graphs and updating the appropriate severity value. We decided to exclude them from the detector probability values, since they are a different type of event (similar to LossDetector and Noisy-to-Constant/Constant-To-Noisy updates).

I'll collect more samples (if needed!) and update the values used by the different detectors and fusion methods, and finally move on to validating the output produced by the fusion methods next.

### 04

#### Feb

#### 2014

Last week was rather short (holiday and unwell for a couple of days). Spent a bit of time looking into other fusion methods, but then decided to take a break and look into writing the eventing script's output to a database (for easier inspection). Talked to Shane and he created a separate database that I could play with, just to be safe. After looking at their current schema, I spent some time thinking about an ideal way of storing the probabilities of the different methods in the database. Finally, finalised the schema, created the tables and started working on inserting the event data into the DB.

### 23

#### Jan

#### 2014

Figured out how to use Bayes as a method to combine the beliefs/probabilities to obtain a final significance probability out of the results of several detectors. I had to use different values than the DS ones I had previously calculated, so I spent a while calculating and double-checking the values I needed for Bayes. After that, I did some manual calculations/testing before diving into implementing it in the eventing python script.
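As an illustration of how Bayes can combine several detectors into one significance probability, here is a small sketch. It assumes the detectors fire independently given the true class (a naive Bayes assumption that the entry does not state), and the function name `bayes_fuse` and the example numbers are hypothetical.

```python
def bayes_fuse(prior_sig, likelihoods):
    """Posterior probability that an event group is significant.

    prior_sig: prior probability of significance (initial Sig probability).
    likelihoods: one (p_fire_given_sig, p_fire_given_fp) pair per detector
                 that fired; assumed conditionally independent.
    """
    sig = prior_sig
    fp = 1.0 - prior_sig
    for p_given_sig, p_given_fp in likelihoods:
        sig *= p_given_sig
        fp *= p_given_fp
    total = sig + fp
    if total == 0.0:
        raise ValueError("evidence has zero probability under both classes")
    # Normalise so that P(sig) + P(fp) = 1.
    return sig / total


# Hypothetical example: two detectors fired on the same event group.
posterior = bayes_fuse(0.5, [(0.8, 0.2), (0.6, 0.3)])
```

Unlike D-S, this formulation has no separate mass for "undecided", so every bit of evidence is split between significant and FP.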

Also read a few other papers regarding different methods of belief fusion, namely the Averaging and Cumulative functions. After talking to Richard, we decided to implement those functions so as to compare the values obtained by each method.

I also read some material on Fuzzy Logic, so I plan on implementing that next.

Modified the eventing script to enable easy addition of different belief fusion methods, since I plan on implementing more methods as I come across them.

### 13

#### Jan

#### 2014

Started the week by doing a summary of the Smokeping data that Shane and I collected last year. This included grouping the streams based on average means (i.e. < 5, < 30, < 100, > 100) and summing up the number of FPs and significant/insignificant/unclassified events for the whole stream and also on a per-detector basis. Using these numbers, I was able to work out more accurate probability values for each detector. This also made it easy to see exactly where we needed more data, e.g. only having 5 Mode events throughout all the streams with an avg mean of < 5.

Then, I modified my eventing python script to use different probabilities based on the detector that fired and the average mean of the stream at that time. These probability values will still need to be updated later on since the sample size is too small for some of the detectors. However, this is tricky since some detectors (especially Mode) only fire occasionally when the mode of the time series has changed considerably, so getting a big enough sample size is tricky.
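The per-detector, per-mean-group probabilities described above could be derived along these lines. This is a sketch under assumptions: the names `mean_bin` and `detector_probabilities`, the input layout, and the choice to place a mean of exactly 100 in the top group are all hypothetical; only the bin boundaries come from the entry.

```python
from bisect import bisect_right
from collections import defaultdict

BIN_EDGES = [5, 30, 100]                    # boundaries from the summary
BIN_LABELS = ["<5", "<30", "<100", ">100"]  # a mean of exactly 100 falls in ">100" here


def mean_bin(avg_mean):
    """Return the label of the stream group that an average mean falls into."""
    return BIN_LABELS[bisect_right(BIN_EDGES, avg_mean)]


def detector_probabilities(rated_events):
    """Estimate P(significant) per (detector, mean group) from rated events.

    rated_events: iterable of (detector, avg_mean, is_significant) tuples
                  (hypothetical layout).
    Returns {(detector, group): (probability, sample_size)} so that small
    samples stay visible.
    """
    counts = defaultdict(lambda: [0, 0])  # [significant, total]
    for detector, avg_mean, is_significant in rated_events:
        key = (detector, mean_bin(avg_mean))
        counts[key][0] += int(is_significant)
        counts[key][1] += 1
    return {key: (sig / total, total) for key, (sig, total) in counts.items()}
```

Returning the sample size next to each probability makes it easy to flag groups, such as Mode events on low-mean streams, where the estimate is not yet trustworthy.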

Spent some time looking over Bayes' theorem, which I plan on using when comparing different fusion methods.

### 17

#### Dec

#### 2013

Spent a fair amount of time reading papers on the limits of and alternatives to Dempster-Shafer for combining evidence. The main limitation of D-S is that it can produce counter-intuitive results in case of strong conflict between argument beliefs. However, there are no elements of conflict in the belief functions for the detectors, which makes D-S the preferred option (until I find a better alternative). I also came across a number of other rules (Bayesian, fuzzy logic, TBM, etc.) that I plan on reading about during the break.

Also spent a considerable amount of time looking at the events for the AMP-ICMP graphs for the Quarterpounder to Google.com stream. There are a huge number of events, which makes grouping and rating them take forever. I need to do more of the amp-icmp stream analysis before calculating the belief values for each detector, and that's something that I plan on doing during the break too.

### 09

#### Dec

#### 2013

I met with Dr. Joshi from the Stats Dept and confirmed that the method I was using was indeed correct. He also mentioned looking into Bayes' theorem as an alternative, and I spent some time reading up on it. There is an element of "undecidedness" with the event significance, which is why Dempster-Shafer is more appropriate than Bayes'.

Also updated Netevmon to periodically send out mean updates to the eventing script. These mean values will be used in deciding which probability values to use in different cases (e.g. when the measurements are noisy/constant, etc). Also also, looked at and rated the events for a couple of streams and updated some of the "busier" streams with last week's events.

### 02

#### Dec

#### 2013

Spent the first half of the week implementing a version of the Dempster-Shafer belief function in the eventing Python script. After debugging and testing to make sure that it worked properly, I went on to analysing the events for a few Smokeping streams. This consisted of finding the start of a detected event, finding it in the AMP graphs, giving it a significance rating of 0-5, with 0 being a FP and 5 being a very significant event, and then entering details of the event group in a spreadsheet. This was rather tedious and, depending on the stream, sometimes took forever.

I plan on seeing Dr. Joshi from the Stats Dept next week to confirm the Dempster-Shafer calculations, after which I will have to resume the event analysing.
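For reference, Dempster's rule of combination over a two-element frame can be sketched as below. This is not the code from the eventing script: the frame {sig, fp}, the dictionary layout, the name `combine` and the example masses are assumptions made for illustration. The "either" mass represents belief not committed to significant or FP.

```python
def combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function is a dict with keys "sig", "fp" and "either"
    (mass on the whole frame, i.e. undecided) that sums to 1.
    """
    # Conflict: one source says significant while the other says FP.
    conflict = m1["sig"] * m2["fp"] + m1["fp"] * m2["sig"]
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    norm = 1.0 - conflict
    return {
        "sig": (m1["sig"] * m2["sig"] + m1["sig"] * m2["either"]
                + m1["either"] * m2["sig"]) / norm,
        "fp": (m1["fp"] * m2["fp"] + m1["fp"] * m2["either"]
               + m1["either"] * m2["fp"]) / norm,
        "either": m1["either"] * m2["either"] / norm,
    }


# Hypothetical belief values for two detectors that fired on one event group.
plateau = {"sig": 0.6, "fp": 0.0, "either": 0.4}
mode = {"sig": 0.5, "fp": 0.0, "either": 0.5}
fused = combine(plateau, mode)
```

With no mass on "fp" the conflict term is zero, so the normalisation step, which is where the counter-intuitive results under strong conflict come from, has no effect.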

### 19

#### Nov

#### 2013

Finalised the TEntropy detector and committed the changes to Git. Then spent the next few days reading up on belief theory and the Dempster-Shafer belief functions. Started working on a Python script for a server that listens for new connections, understands the protocol used by AnomalyExporter and parses the event data received from the anomaly_ts client.

Plan for the next week is to finish the eventing Python script (which will include event grouping by ts and stream # initially) and start gathering data from the events in order to calculate a confidence level for each detector. This is necessary for using belief theory to determine whether an event is really an event since the various detectors might not produce the same results.
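One simple way to group events by timestamp and stream is sketched below. The 60-second gap, the event tuple layout and the name `group_events` are hypothetical, so treat this as one possible reading of the plan rather than the grouping the script ended up using.

```python
def group_events(events, max_gap=60):
    """Group events on the same stream that occur close together in time.

    events: iterable of (ts, stream_id, detector) tuples (hypothetical layout).
    max_gap: largest gap in seconds between consecutive events in a group
             (hypothetical value).
    Returns a list of groups in order of their start time.
    """
    groups = []
    open_groups = {}  # stream_id -> most recent group on that stream
    for ts, stream_id, detector in sorted(events):
        group = open_groups.get(stream_id)
        if group is None or ts - group["end"] > max_gap:
            group = {"stream": stream_id, "start": ts, "end": ts,
                     "detectors": []}
            groups.append(group)
            open_groups[stream_id] = group
        group["end"] = ts
        # Detectors are kept in firing order within the group.
        group["detectors"].append(detector)
    return groups
```

Keeping the detectors in firing order costs nothing here and leaves room for later analysis of which detectors tend to fire first.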

### 12

#### Nov

#### 2013

Spent the week testing the TEntropy detector on different streams and refining the Symboliser to reduce the number of FPs.

One problem that I discovered was that the magnitude of the events was not represented correctly by the Symboliser: a severe event had the same t-entropy result as a trivial one since only 1 character was inserted at a time. Hence, a solution to this was the introduction of multiple characters based on the severity of the change.

Another problem was that small, insignificant changes were triggering events when, ideally, they should not have been creating entropy. Hence, I added a condition that checks whether a measurement is significantly different from the previous mean before triggering a non-default character.
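The two fixes above might look something like the following sketch. It reads "multiple characters" as emitting a longer run of a non-default character for more severe changes, which is an assumption. The name `symbolise`, the characters used and every threshold are hypothetical, not the Symboliser's real parameters.

```python
def symbolise(measurement, prev_mean, min_change=0.2, step=0.5, max_run=4):
    """Turn one measurement into the character(s) appended to the string.

    measurement: the latest sample.
    prev_mean: mean of the preceding samples.
    min_change: relative change below which the default character is used
                (hypothetical threshold).
    step: extra relative change needed for each additional character.
    max_run: cap on the number of characters emitted at once.
    """
    default_char, change_char = "a", "b"  # hypothetical alphabet
    if prev_mean <= 0:
        return default_char
    change = abs(measurement - prev_mean) / prev_mean
    # Small, insignificant changes should not create entropy.
    if change < min_change:
        return default_char
    # More severe changes insert more characters, so they disturb the
    # string (and hence its t-entropy) more than trivial ones.
    run = min(int((change - min_change) / step) + 1, max_run)
    return change_char * run
```

A relative threshold is used here so that the same setting behaves sensibly on both low-mean and high-mean streams; an absolute threshold would be another option.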

### 06

#### Nov

#### 2013

The past couple of weeks have been less productive than I would have liked, what with it being the end of the semester and final assignments and exam marking needing to be done. I managed to integrate the Symboliser and TEntropyCalculator with the existing Netevmon code, which produced an average t-entropy value that was then put through a PlateauLevelDetector. However, there were a considerable number of False Positives and False Negatives, which was rather discouraging. I knew that the actual t-entropy calculation was correct (after running the original source code and comparing values), so the problem had to be with the strings that were produced by the Symboliser. This meant that I had to spend a lot of time tinkering with the metrics and different parameters of the Symboliser to improve the strings produced so that they would accurately represent the nature of the traffic.

Plan for the following week: keep on refining the metrics and experiment with different parameter values (e.g. shorter/longer string lengths, etc). And also possibly start reading up on the Dempster-Shafer belief theory and finding a decent detector that might be implemented by one of the Summer Research students.

### 15

#### Oct

#### 2013

Managed to get a working implementation of Flott which does the necessary initialisation and calculations for obtaining the t-entropy of a given string! It took longer than expected though - I was right about the objects and functions that I would need out of the original source code, but missed a number of lines in different places which meant that the tokens and values used in the calculations were incorrect, thus resulting in an incorrect output. So, I spent many, many hours adding debugging output in my implementation and the original code after each iteration/processing and compared the results to figure out what had gone wrong. I was then able to produce a t-entropy value that was very close to the original program's output. After going over the original code again, there was a scaling factor that I had missed and that fixed the last issue.

Over the next week, the plan is to refactor the code and finalise it for addition to Netevmon.

### 08

#### Oct

#### 2013

Over the last two weeks, I have been working on the TEntropy detector.

During the first week, I used anomaly_ts and anomaly_feed and produced output for a number of different streams by using a combination of different metrics, string lengths, sliding window sizes, and range delimiters. After producing strings for each sliding window sample, a python script calls the external t_Entropy function with the string as a parameter to obtain the average t-Entropy for each string and pipes the output to a file. I then wrote another Python script to produce a Gnuplot script for producing time-series graphs so that I could inspect the results. At this point, it was apparent that the t-Entropy detector was a feasible option and hence, I had to start implementing the actual t-entropy calculations within Netevmon.

Spent last week going over the T_Entropy library that I found called Fast Low Memory T-transform (flott), which is used to compute the T-complexity of a string which in turn is used to compute the t-Entropy. Unfortunately, the library consisted of around a dozen .c and header files, which made it somewhat tricky to determine which parts I would need. So, I spent around 3 days looking over the source code and trying to understand it before starting to work on adding the necessary bits to a new detector. Found the function that is used for calculating the actual t-complexity, t-information and t-entropy values, so have been working on duplicating those calculations. However, there are a number of other initialisation functions that are required before the t-* can be calculated, so I have to look into them at some point.

Also had a bunch of marking to do, so couldn't spend all week working on the flott adaptation.

### 24

#### Sep

#### 2013

Still working on the TEntropy detector, but have made good progress this week. Ironed out any bugs that I found, and have output in the correct format. Then, I spent a great deal of time collecting output for 8 different streams, each with different character bin sizes and string lengths. Also wrote a Python script which takes the output files for different streams (which include the string used for entropy measurements) and passes them to an external script which calculates an average t-entropy measurement for each timestamp. So, I now have a bunch of output files with entropy values that need to be plotted to determine which combination of string lengths and character bin sizes would be optimal.

After a brief look at a couple of graphs, it seemed that a greater string length (50) had no benefits over using a smaller string length (20). The patterns were very similar for each string length and differed very little, which implies that the additional computational cost of calculating the t-entropy for 50 characters for every single timestamp is not worth it.

### 17

#### Sep

#### 2013

Spent the week working on the TEntropy Detector. Added a few different metrics that will be used to determine the most suitable/appropriate combination of metrics (by trial and error). Choosing the correct metric would allow transforming the samples into a time-series of average entropy values, and these will be used to detect anomalies. Worked on converting the characters and started implementing a buffer to store the characters as they are added.

### 10

#### Sep

#### 2013

Read a paper about T-Entropy and started implementing a detector that uses sliding windows, calculates some statistics, assigns an appropriate character/"class" to each window, which will then be concatenated into a string of characters, which will in turn be used to obtain the average T-Entropy for a sliding window. However, NNTSC/Netevmon was down until Wednesday so I didn't get to test it after that.

The rest of the week was spent taking care of GA duties, marking a ridiculous amount of assignments, updating Moodle grades, yadda yadda. Didn't manage to get any work done on the project, unfortunately.

Next week, I plan on working on the T-Entropy detector some more, especially adding new statistics and trying to figure out a combination of stats that "work".