Blogs

24 Jul 2017

Had a quick look at moving the BGP code from Python 2.7 to Python 3, and it doesn't look like it will require much work. Didn't go ahead though, as moving to Python 3 also appears to require moving to exabgp4, which is still rather poorly documented. Probably still worth a look, sooner rather than later.

Spent some time hardening prefix creation to deal with incorrectly formed prefixes, and wrote some unit tests to make sure that it behaves correctly. Also updated unit tests for filters to bring them up to date with recent changes. Wrote some brief overview documentation to describe how it all fits together ahead of Florin starting work on this next week.
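A minimal sketch of what this kind of prefix hardening might look like, for IPv4 only. The name `parse_prefix`, the (integer, length) return format and the decision to reject prefixes with host bits set are all assumptions made for illustration, not the actual code from the BGP program.

```python
def parse_prefix(text):
    """Parse an IPv4 prefix such as "10.0.0.0/8" into (network_int, length).

    Hypothetical helper: raises ValueError on any malformed input.
    """
    if not isinstance(text, str) or text.count("/") != 1:
        raise ValueError("expected address/length: %r" % (text,))
    address, length_text = text.split("/")

    octets = address.split(".")
    if len(octets) != 4:
        raise ValueError("expected four octets: %r" % (text,))

    value = 0
    for octet in octets:
        # isdigit() alone accepts some non-ASCII digits, so check isascii() too
        if not (octet.isascii() and octet.isdigit()) or int(octet) > 255:
            raise ValueError("bad octet in %r" % (text,))
        value = (value << 8) | int(octet)

    if not (length_text.isascii() and length_text.isdigit()):
        raise ValueError("bad prefix length in %r" % (text,))
    length = int(length_text)
    if length > 32:
        raise ValueError("prefix length out of range in %r" % (text,))

    # Assumption: a prefix with host bits set (e.g. 10.0.0.1/8) is an error
    host_mask = (1 << (32 - length)) - 1
    if value & host_mask:
        raise ValueError("host bits set in %r" % (text,))
    return value, length
```

Unit tests for something like this would feed in both well-formed prefixes and a list of malformed strings, checking that each malformed one raises `ValueError`.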

10 Jul 2017

Created configuration within my BGP program to peer with routers both internal and external to my test network, and to filter/distribute routes between them. I had to slightly change the way that the network topology was described to better account for loopback addresses and routed interfaces (my earlier networks were just graphs of node names), and update the nexthop calculations to take this into account (especially in the case where my BGP speaker was not directly connected to the external peer). Routes can now be sent around my test network, and filtered or modified as they are received from or sent to peers.

Pushed out some updates to the ampweb packages to fix some reported bugs.

10 Jul 2017

Brad set up a physical network using the lab Junipers for me to test my BGP code on, and I made a start working with it. Configured interfaces and set up multihop BGP to get the external peers talking to my controller running a few hops inside the network.

Spent most of the week working on the performance report.

10 Jul 2017

Put together an architecture diagram to help talk about how the BGP program fits together. Spent some more time on scaling - some bits had slow custom implementations of things that were no longer needed, and so were removed.

Spent most of the week working on a performance report based on some measurement data we are collecting, writing code to extract the specific data and generate the appropriate graphs.

10 Jul 2017

Brought the network/topology management into line with the new multi-process model. The topology can now be modified on the fly, which will generate messages to various peers allowing them to re-run their route selection algorithms to take the changes into account.

Removed the distinction between internal and external nodes, and reworked the classes to allow multiple node types, depending on how we want to manage them. We could speak BGP to some, OpenFlow (or another SDN-type protocol) to others, etc.

Spent some more time looking at improving memory usage. Found what I consider a bug in the Python multiprocessing queue implementation - a reference to the last message sent to the queue is kept around when it is no longer required, continuing to use memory. I send infrequent but large messages, which means a lot of memory could be tied up for quite some time for no useful purpose.

28 Jun 2017

Libprotoident 2.0.11 has been released.

Firstly, this release updates the existing tools to be compatible with both libflowmanager 3 and parallel libtrace. This means that the tools can now take advantage of any parallelism in the traffic source, e.g. streams on a DAG card or a DPDK-capable NIC.

Secondly, we've added 61 new application protocols to our set of detectable protocols, bringing the total supported number of applications to 407. A further 25 existing protocols have been updated to better match new observed traffic patterns.

Finally, there have been a couple of minor bug fixes as well.

Note that this release will require both libflowmanager 3 and libtrace 4, which means that you will likely have to upgrade these libraries prior to installing libprotoident 2.0.11. If this is problematic for you but you still want the new application protocol rules, you can use the '--with-tools=no' option when running ./configure to prevent the tools (which are the reason for the upgraded dependencies) from being built.

The full list of updated protocols can be found in the libprotoident ChangeLog.

Download libprotoident 2.0.11 here!

12 Jun 2017

Spent most of the week working on the BGP program. Had a bit of a general tidy up and reorganisation of the class hierarchy, and updated unit tests to match the changes.

All the various copies of routing tables are now stored on disk when not actively being modified to try to save on memory usage (and to make recovery easier in the future if BGP connections are interrupted). The Python garbage collector generally doesn't seem keen to return memory to the operating system though, so processes still end up with something of a high water mark for memory usage, but this does improve it slightly.

Finished adding a command interface to allow updating filters on the fly, as well as any other operations we want to add (could inject crafted BGP messages, swap out parts of the decision process etc). Peer objects rerun the filters over received routes as they are updated. Peer objects also now run a BGP decision process to determine the best routes to export, and make sure that their own ASN is not in the path.
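As a rough sketch of the export step, the function below drops any route whose AS path already contains the peer's ASN and then picks one best route per prefix. Everything here is assumed for illustration: the name `select_exports`, the dict-of-lists input, and above all the tie-break, which only compares AS path length. A full BGP decision process also considers local preference, origin, MED and more.

```python
def select_exports(candidates, peer_asn):
    """Pick the best exportable route per prefix for one peer.

    candidates: dict mapping prefix -> list of (as_path, nexthop) tuples,
                where as_path is a tuple of ASNs (hypothetical layout).
    peer_asn:   the ASN of the peer we are exporting to.
    """
    best = {}
    for prefix, routes in candidates.items():
        # Loop prevention: never send a peer a route that already
        # has the peer's own ASN in the path.
        usable = [route for route in routes if peer_asn not in route[0]]
        if usable:
            # Simplified decision process: shortest AS path wins.
            best[prefix] = min(usable, key=lambda route: len(route[0]))
    return best
```

A prefix with no usable routes is simply left out of the result, which the caller would treat as a withdrawal for that peer.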

12 Jun 2017

Tried to save some more memory in my ipaddress module by calculating netmasks as required rather than storing them. Storing the AS path as an array rather than a list can also save considerable amounts of memory in my route entries.
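The two ideas can be sketched as follows. The `Prefix` class is a hypothetical IPv4-only stand-in for the custom ipaddress module, and the `"I"` typecode is an assumption: it is usually 4 bytes per item, but the size is platform dependent.

```python
import sys
from array import array


class Prefix:
    """Hypothetical minimal IPv4 prefix: stores only address and length."""
    __slots__ = ("address", "length")

    def __init__(self, address, length):
        self.address = address  # network address as an integer
        self.length = length    # prefix length, 0..32

    @property
    def netmask(self):
        # Computed on every access instead of being stored per object
        return (0xFFFFFFFF << (32 - self.length)) & 0xFFFFFFFF


# A list holds a pointer to a separate int object for every ASN; an array
# packs the raw values together in a single buffer.
as_path_list = [65001, 65002, 65003, 65004]
as_path_array = array("I", as_path_list)

# Note that getsizeof() on the list does not count the int objects it
# points to, so the real gap is larger than these two numbers suggest.
print(sys.getsizeof(as_path_list), sys.getsizeof(as_path_array))
```

The trade-off is a little extra CPU each time a netmask is needed, which only pays off if netmasks are read far less often than prefixes are stored.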

Decided it was easier to send the full current state of routes between peers and VRFs rather than incremental updates. It means the state is always up to date and we don't need to keep track per peer or VRF when there are one-to-many relationships and peers might come and go at different times. Passing 1 million routes between processes takes milliseconds which is plenty fast enough.

Fixed a bug in the equality/hashing functions for route entries that meant they would never match and so all routes were being withdrawn and re-advertised to peers any time there were changes to be made.
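The post doesn't say what the bug was, so the sketch below only shows what a consistent pair of equality and hashing functions looks like and why it matters when full route state is compared. `RouteEntry`, its attributes and `diff_routes` are hypothetical names.

```python
class RouteEntry:
    """Hypothetical route entry that can be compared and used in sets."""
    __slots__ = ("prefix", "nexthop", "as_path")

    def __init__(self, prefix, nexthop, as_path):
        self.prefix = prefix
        self.nexthop = nexthop
        self.as_path = tuple(as_path)  # tuples are hashable, lists are not

    def _key(self):
        # Single source of truth so __eq__ and __hash__ cannot disagree
        return (self.prefix, self.nexthop, self.as_path)

    def __eq__(self, other):
        if not isinstance(other, RouteEntry):
            return NotImplemented
        return self._key() == other._key()

    def __hash__(self):
        return hash(self._key())


def diff_routes(old, new):
    """Return (to_withdraw, to_advertise) given old and new full state."""
    old, new = set(old), set(new)
    return old - new, new - old
```

If two entries describing the same route never compare equal, `diff_routes` reports every old route as a withdrawal and every new one as an advertisement, which matches the symptom described.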

12 Jun 2017

Moved the peer and VRF route management out into a separate process for each individual one, so that they can process filters etc in parallel without blocking other peers/VRFs. They are all self-contained now and operate via message passing - BGP commands or routes come in, which are processed and then sent on as further BGP commands or route lists.

Spent some time tracking down the cause of ports not being reused correctly in the AMP throughput test. When run through the server, with IPv4 and IPv6 available, it was not properly closing the socket for the unused address family once a client connected so the test port would still be in use when it later tried to restart the connection to test in the other direction. It should now make sure that only the address family in use has a socket bound to the test port.

12 Jun 2017

Replaced the Python ipaddress module in my BGP program with my own very minimal one to reduce memory usage. Replaced some empty sets/lists that may not ever have data with "None" by default, as empty data structures are quite memory heavy when you have millions of them. Also updated a couple of heavily used classes to use __slots__ (explicitly stating the attributes) rather than leaving it open ended (and using more memory).
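A small sketch of both techniques, using made-up class and attribute names (`RouteDict`, `RouteSlots`, `communities`) rather than anything from the real program:

```python
import sys


class RouteDict:
    """Open-ended class: each instance carries a __dict__ and an empty set."""

    def __init__(self, prefix, nexthop):
        self.prefix = prefix
        self.nexthop = nexthop
        self.communities = set()  # allocated even if never used


class RouteSlots:
    """Fixed attributes via __slots__, with None standing in for 'empty'."""
    __slots__ = ("prefix", "nexthop", "communities")

    def __init__(self, prefix, nexthop):
        self.prefix = prefix
        self.nexthop = nexthop
        self.communities = None  # no set is created until one is needed

    def add_community(self, community):
        # Create the set lazily on first use
        if self.communities is None:
            self.communities = set()
        self.communities.add(community)


# An empty set is a full object per instance; None is a shared singleton.
print(sys.getsizeof(set()), sys.getsizeof(None))
```

The cost is that every reader of `communities` has to cope with `None`, so this is probably only worth doing for attributes that are empty on most instances.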

Started looking at adding a command interface to allow updating filters and receiving external measurements or metadata about how we should be routing traffic. The current event loop around I/O doesn't really support this (and has other issues with deadlocking against exabgp), so it needed to be rewritten. All exabgp reading and writing now happens in the same place, and in a different thread to the command interface and route management.