I'm trying to solve a problem with monitoring our growing number of Metro-E circuits. The difficulty is that we connect our router interfaces to third-party media converters. When there is a cut, our interfaces continue to report up/up and we never get "link down" alerts. Because we have redundant paths to many locations, traffic switches to the protect path, so we are never alerted that the primary path is down and never see any device outages on that network.
If our engineers could devise a way to advertise only certain routes over the Metro-Es, we could monitor an interface or device across the links. I have been unsuccessful going that route.
Is there a way to use InterMapper to monitor if traffic stops for a link, or if OSPF neighbors and adjacencies are lost? I have seen requests for high utilization/threshold indicators. I'm looking for the opposite - a NO UTILIZATION indicator.
Surely there is a simple monitoring solution that does not require custom network routing or monitoring of third party networks. Any advice or suggestions are greatly appreciated.
Solution submitted by speachey
A brilliant programmer in our office came up with a solution that seems like a winner. He suggested monitoring the ARP entry for the far-end address of each Metro-E link. We use the built-in InterMapper SNMP comparison probe to perform a string comparison.
Example on our Cisco routers:
Do an snmpwalk to list the ARP entries:
snmpwalk -v 2c -c <your snmp string> your.router atIfIndex
RFC1213-MIB::atIfIndex.2.1.<metro-e interface> = INTEGER: 2
RFC1213-MIB::atIfIndex.50.1.<other interface> = INTEGER: 50
RFC1213-MIB::atIfIndex.58.1.<other interface> = INTEGER: 58
Set the SNMP comparison probe's address to the router at the end of the circuit you want to monitor from.
variable: atNetAddress.2.1.<metro-e interface> (atNetAddress replaces atIfIndex string from snmpwalk)
value: <metro-e interface>
Legend: Metro-E Peer
We set the probe value to the IP address of the Metro-E interface and verify it with the string comparison; if the IP of the interface across the link is removed from the ARP table, the probe will alert us. The only tweak we had to make on the routers was to reduce the ARP timeout on the interface from the default of 4 hours to 5 minutes, so that a stale entry ages out soon after a cut.
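For anyone who wants to try the same check outside InterMapper, here is a rough Python sketch of the idea: parse the atIfIndex walk output and test whether the peer's address is still in the ARP table for the Metro-E interface. It is only an illustration, not what the probe does internally. The function names and the sample text are made up, the 192.0.2.x addresses are documentation placeholders, and it assumes the walk output looks like the lines shown above, with an IP address where the <metro-e interface> placeholder appears.

```python
import re

# Matches lines such as "RFC1213-MIB::atIfIndex.2.1.192.0.2.2 = INTEGER: 2".
# Assumption: the index is <ifIndex>.1.<IPv4 address>, as in the walk above.
_AT_LINE = re.compile(r"atIfIndex\.(\d+)\.1\.(\d+\.\d+\.\d+\.\d+)\s*=\s*INTEGER:\s*\d+")

# Hypothetical sample output; the addresses are placeholders, not real data.
SAMPLE_WALK = """\
RFC1213-MIB::atIfIndex.2.1.192.0.2.2 = INTEGER: 2
RFC1213-MIB::atIfIndex.50.1.192.0.2.50 = INTEGER: 50
RFC1213-MIB::atIfIndex.58.1.192.0.2.58 = INTEGER: 58
"""


def parse_at_table(walk_output):
    """Return the set of (ifIndex, ip) pairs found in atIfIndex walk output."""
    entries = set()
    for line in walk_output.splitlines():
        match = _AT_LINE.search(line)
        if match:
            entries.add((int(match.group(1)), match.group(2)))
    return entries


def peer_in_arp(walk_output, if_index, peer_ip):
    """True if peer_ip still has an ARP entry on interface if_index."""
    return (if_index, peer_ip) in parse_at_table(walk_output)


if __name__ == "__main__":
    # A missing entry is the condition we alert on.
    if peer_in_arp(SAMPLE_WALK, 2, "192.0.2.2"):
        print("Metro-E peer present in ARP table")
    else:
        print("ALERT: Metro-E peer missing from ARP table")
```

In practice the text would come from running the snmpwalk command above on a schedule. As with the probe, the check only reacts as fast as the ARP timeout on the interface allows.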
I hope this helps others with similar Metro-E woes.