Core Roles and Responsibilities
- Network Monitoring
  - Continuously monitor network performance, connectivity, and system operations
  - Use NOC monitoring tools to identify faults, alarms, outages, and performance issues
- Incident Response & Troubleshooting
  - Respond to network alerts and service interruptions in real time
  - Perform initial diagnosis and escalate to the relevant teams when necessary
  - Ensure incidents are resolved within SLA timelines
- Maintenance & Support
  - Provide L1/L2 support for network devices, including routers, switches, firewalls, and servers
  - Perform routine health checks on infrastructure and applications
  - Support configuration updates and scheduled maintenance activities
- Escalation & Coordination
  - Communicate with field engineers, ISPs, data center teams, and other stakeholders
  - Follow the escalation matrix when issues require higher-level intervention
- Documentation & Reporting
  - Maintain accurate incident logs and Root Cause Analysis (RCA) reports
  - Update knowledge base articles and network topology documentation
  - Prepare daily and weekly network performance reports
- Security & Compliance
  - Monitor security alerts and unauthorized access attempts
  - Ensure compliance with operational standards and cybersecurity policies
- Process Improvement
  - Suggest enhancements to monitoring and incident response processes
  - Participate in audits, drills, and continuous improvement initiatives
- Customer Communication
  - Provide timely status updates to internal teams and customers
  - Maintain professionalism during high-severity outages
Additional Skills Expected
- Strong understanding of networking protocols (TCP/IP, DNS, DHCP, BGP, OSPF)
- Basic Linux/Windows server administration
- Familiarity with ITSM tools (ServiceNow, JIRA, Remedy)
- Familiarity with monitoring tools (SolarWinds, Nagios, Zabbix, PRTG, Datadog, etc.)
- Willingness to work in 24/7 rotational shifts