NetFlow deduplication (also called flow deduplication) at collection time faces an interesting dilemma that revolves around enterprise NetFlow collection and scalability. This is a follow-up to the post on NetFlow Direction, which really should be read first because it outlines one of the primary reasons for NetFlow deduplication. That post also defines flow deduplication and stitching.

 

Some of the largest issues facing the NetFlow deduplication philosophy are:

 

  • The introduction of distributed collectors means deduplication across collectors.
  • Asymmetrical paths exacerbate the issue. A path is asymmetrical when the flow from A to B passes through one or more routers or switches that aren’t in the path from B back to A.
  • The emergence of NetFlow v9 and IPFIX introduced a nearly limitless number of new elements, which prevents most flows from being deduplicated. This is because the contents of what is saved about a flow can, and likely will, change from router to router. If these flows are deduplicated anyway, many of the details are lost, and recovering them will require pointers back to the original flows.
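
To make the last point concrete, here is a minimal sketch of tuple-based deduplication. The record layout and field names (`app`, `rtt_ms`, `exporter`) are assumptions made up for illustration and do not come from any real export template. Two routers report the same connection but with different extra elements, and collapsing them to the classic 5-tuple drops those extras unless indexes back to the originals are kept.

```python
from collections import defaultdict

# Hypothetical flow records; field names are illustrative only.
flows = [
    {"exporter": "10.0.0.1", "src": "192.168.1.10", "dst": "172.16.5.20",
     "sport": 51000, "dport": 443, "proto": 6, "bytes": 4200, "app": "https"},
    {"exporter": "10.0.0.2", "src": "192.168.1.10", "dst": "172.16.5.20",
     "sport": 51000, "dport": 443, "proto": 6, "bytes": 4200, "rtt_ms": 18},
]

TUPLE_FIELDS = ("src", "dst", "sport", "dport", "proto")


def flow_key(flow):
    """Return the classic 5-tuple used to decide that two flows are duplicates."""
    return tuple(flow[field] for field in TUPLE_FIELDS)


def deduplicate(flows):
    """Collapse flows that share a 5-tuple, keeping indexes of the originals."""
    groups = defaultdict(list)
    for index, flow in enumerate(flows):
        groups[flow_key(flow)].append(index)

    deduped = []
    for key, indexes in groups.items():
        entry = dict(zip(TUPLE_FIELDS, key))
        # Count bytes once (from the first exporter) so utilization is not doubled.
        entry["bytes"] = flows[indexes[0]]["bytes"]
        entry["exporters"] = [flows[i]["exporter"] for i in indexes]
        # Pointers back to the original records, where the extra elements still live.
        entry["originals"] = indexes
        deduped.append(entry)
    return deduped


deduped = deduplicate(flows)
```

The single deduplicated entry no longer carries `app` or `rtt_ms`; the only way to get them back is to follow `originals` to the records each exporter actually sent.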

 

Flow Stitching is Fallible

If the routing or switching fabric causes asymmetrical connection paths, flow stitching is supposed to resolve the dilemma. In short, flow stitching finds the return flow, which could be through another router, and creates a biflow from the two unidirectional flows. This strategy, however, faces a problem that is explained below.
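
As a rough illustration of stitching, the sketch below pairs each unidirectional flow with the flow whose source and destination are reversed. The record fields and the `stitch` helper are hypothetical, and the sketch assumes duplicates of the same direction have already been removed. Note that it only works when both directions are visible in the same place.

```python
# Hypothetical unidirectional flow records; field names are illustrative only.
flows = [
    {"exporter": "R1", "src": "192.168.1.10", "dst": "172.16.5.20",
     "sport": 51000, "dport": 443, "proto": 6},
    {"exporter": "R2", "src": "172.16.5.20", "dst": "192.168.1.10",
     "sport": 443, "dport": 51000, "proto": 6},
    {"exporter": "R1", "src": "192.168.1.11", "dst": "172.16.5.30",
     "sport": 52000, "dport": 22, "proto": 6},
]


def forward_key(flow):
    """Key describing the flow in the direction it was observed."""
    return (flow["src"], flow["dst"], flow["sport"], flow["dport"], flow["proto"])


def reverse_key(flow):
    """Key the matching return flow would have (endpoints swapped)."""
    return (flow["dst"], flow["src"], flow["dport"], flow["sport"], flow["proto"])


def stitch(flows):
    """Pair each flow with its return flow; unmatched flows get a return of None."""
    pending = {}
    biflows = []
    for flow in flows:
        match = pending.pop(reverse_key(flow), None)
        if match is not None:
            biflows.append({"forward": match, "return": flow})
        else:
            pending[forward_key(flow)] = flow
    # Anything still pending never showed a return flow in this data set.
    biflows.extend({"forward": flow, "return": None} for flow in pending.values())
    return biflows


biflows = stitch(flows)
```

The first biflow joins the R1 and R2 records even though they came from different routers. The second has no return flow, which is exactly what happens when the return path is exported to a collector this code cannot see.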

 

Flow Deduplication Doesn’t Scale Across Collectors

If the return flow path is through one or more routers that export to different NetFlow collectors, deduplication and stitching need to be done across collectors. Without this, some flows won’t get deduplicated, while others won’t get stitched because the evidence of a return flow is missing. Hence, utilization can be overstated, and the return path for the deduplicated/stitched flow entry could end up being marked as unknown. There is a solution: these problems can be addressed if deduplication and stitching are performed ad hoc, across collectors.

 

Flow Volumes are Increasing

The volume of flows exported on networks is growing, and vendors such as Cisco are increasing the number of elements in their rich flow exports. A single IPFIX/NetFlow exporter today can send over 1M flows per second, maybe even 10 million in the not-so-distant future. In most cases, this means that a single exporter will have to round-robin the flows between potentially dozens of NetFlow/IPFIX collectors. The Cisco NGA 3240 is already doing this. Theoretically, we may need deduplication and stitching across potentially hundreds of collectors. Trying to perform this task across all the collectors seems nearly infeasible. If, however, queries are performed on demand and the processing necessary to complete them is distributed, we can still expect real-time results.

 

NetFlow Deduplication Solution

The solution to NetFlow deduplication is to save 100% of the data and be able to search 100% of it within seconds across all of the distributed flow collectors. Having all the data related to the search available allows the GUI to trace the flow in both directions. This means deduplication and stitching are performed as needed.
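
One way to picture this on-demand approach is a fan-out query: every collector searches its own raw data in parallel, and the results are grouped into conversations only when someone asks. The sketch below is an assumption-laden illustration, not a description of any product. The in-memory `collectors` dictionary stands in for remote collectors, and `search_collector`, `conversation_key`, and `trace` are hypothetical helpers.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for remote collectors, each holding the raw flows it received.
collectors = {
    "collector-a": [
        {"exporter": "10.0.0.1", "src": "192.168.1.10", "dst": "172.16.5.20",
         "sport": 51000, "dport": 443, "proto": 6},
        {"exporter": "10.0.0.2", "src": "192.168.1.10", "dst": "172.16.5.20",
         "sport": 51000, "dport": 443, "proto": 6},
    ],
    "collector-b": [
        {"exporter": "10.0.0.3", "src": "172.16.5.20", "dst": "192.168.1.10",
         "sport": 443, "dport": 51000, "proto": 6},
        {"exporter": "10.0.0.3", "src": "10.9.9.9", "dst": "10.8.8.8",
         "sport": 1234, "dport": 53, "proto": 17},
    ],
}


def search_collector(name, host):
    """Return every raw flow on one collector that involves `host`."""
    return [dict(flow, collector=name) for flow in collectors[name]
            if host in (flow["src"], flow["dst"])]


def conversation_key(flow):
    """Direction-independent key, so both halves of a connection group together."""
    ends = sorted([(flow["src"], flow["sport"]), (flow["dst"], flow["dport"])])
    return (ends[0], ends[1], flow["proto"])


def trace(host):
    """Query all collectors in parallel, then group matches at query time."""
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(lambda name: search_collector(name, host), collectors))

    conversations = {}
    for flow in (f for result in results for f in result):
        conversations.setdefault(conversation_key(flow), []).append(flow)

    return [
        {
            "conversation": key,
            "exporters": sorted({f["exporter"] for f in matched}),
            "collectors": sorted({f["collector"] for f in matched}),
            "flows": matched,  # nothing is discarded; all raw records remain
        }
        for key, matched in conversations.items()
    ]


traced = trace("192.168.1.10")
```

Here the forward direction lives on one collector and the return direction on another, yet the query returns a single conversation listing all three exporters, with every original record still attached.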

 

The same process is leveraged in threat detection with NetFlow/IPFIX. When uncovering unwanted network behaviors, it is important that a flow passing through two routers not trigger the same threat alarm more than once. By deduplicating the alarms, the administrator experiences the best of both worlds: they receive only a single alarm for any given threat, but that alarm still shows all of the exporters the event was found on.
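
A minimal sketch of that alarm behavior follows. The alarm fields and the choice of (threat, source, destination) as the identity of an event are assumptions for illustration; a real detection engine would likely also consider time windows.

```python
# Hypothetical alarms raised independently for the same event seen on two routers.
alarms = [
    {"threat": "port-scan", "src": "192.168.1.10", "dst": "172.16.5.20",
     "exporter": "10.0.0.1"},
    {"threat": "port-scan", "src": "192.168.1.10", "dst": "172.16.5.20",
     "exporter": "10.0.0.2"},
    {"threat": "dns-tunnel", "src": "192.168.1.44", "dst": "8.8.8.8",
     "exporter": "10.0.0.1"},
]


def deduplicate_alarms(alarms):
    """Merge alarms for the same event, listing every exporter that saw it."""
    merged = {}
    for alarm in alarms:
        key = (alarm["threat"], alarm["src"], alarm["dst"])
        entry = merged.setdefault(key, {
            "threat": alarm["threat"],
            "src": alarm["src"],
            "dst": alarm["dst"],
            "exporters": [],
        })
        if alarm["exporter"] not in entry["exporters"]:
            entry["exporters"].append(alarm["exporter"])
    return list(merged.values())


unique_alarms = deduplicate_alarms(alarms)
```

The administrator sees two alarms instead of three, and the port scan alarm names both exporters that observed it.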

 

NetFlow and IPFIX reporting is a balance between pre-processing data and processing on the fly. Deduplicating every flow received down to a simple tuple may have made sense in a NetFlow v5 world, but today we live in a world of Flexible NetFlow and IPFIX. That means the old simple flow tuple is a tiny part of the data we can expect to collect and report on. Using critical processing power to constantly distill data down into an obsolete construct wastes time and resources, and it results in the loss of critical information.

 

Ad-hoc Deduplication and Stitching

What is needed in the NetFlow and IPFIX collection industry is the ability to save 100% of the flow data and to report on it across all collectors. Customers need reassurance that the data was saved and that it can be accessed easily and quickly. They need the ability to start at a high level and drill in to the individual routers involved with a connection, as well as obtain details on the return path regardless of which collector saved the data. A well-thought-out architecture addresses all of these needs.