- Must have a minimum of 6 years of recent experience in Application Support, Technology Support, DevOps, or CloudOps, and be ready to work in a 24x7 support environment
- Must have managed one or more applications single-handedly and worked as an L2/L3 support engineer for 2 to 3 years
- Must be hands-on with Unix commands, shell scripting, PL/SQL, NoSQL, JCL, and programming languages (Java, Python)
- Must be hands-on with observability tools like ELK, Kibana, Grafana, AppD, Splunk, or similar tools
- Must have domain knowledge in E-commerce, Retail, Consumer Goods, Supply Chain, or any equivalent domain whose applications include direct customer-facing web or mobile applications
- Must be hands-on with analyzing logs, thread dumps, heap dumps, GC logs, etc.
- Working/Functional knowledge of SAP Hybris, IBM Sterling, Magento Commerce, SAP or any other E-commerce platform would be an added advantage
- ITIL Foundation certification would be an added advantage
- Good understanding of microservices architecture
- Working knowledge of Docker, Kubernetes, and cloud platforms would be an added advantage
- Strong written and verbal communication skills are a must
Job Role: SME
Key Responsibilities:
- Application Operations & Management:
- Study and perform capacity planning to ensure that adequate application capacity is available for present and future projections across all environments (Replica and Prod)
- Study volumetrics, traffic, and routing patterns, and perform business KPI trending to identify abnormal patterns or deviations that may cause system issues in the future. Propose and make changes towards closure.
- Perform continuous checks on the end-to-end application with respect to functionality, sequence flows, and system load management
- Handle all escalations for issues that L2 has not resolved or has only partially resolved
- Keep track of all existing defects in the application and review their closure status with the app lead/platform lead
- Lead and participate in all Sev1/Sev2 issue and resolution activities, including: issue analysis, fixing, and RCA identification; log extraction and sharing with the Dev/SRE teams; coordination with the Dev/SRE support team on a workaround or fix to resolve the Sev1/Sev2 issue; and RCA preparation and closure of action points
- Assist in timely reporting of critical issues to management
- Assist in generating KPI reports and business metrics for MIS reporting
- Alert Configuration & Monitoring:
- Ensure all failure points are captured as part of monitoring and alert notifications, and assist in their configuration
- Optimize existing alerts based on application behavior
- Identify and record known gaps based on alerts, and track them to closure
- Monitor alerts in the NGO Portal on an ongoing basis for any exceptions
- Assist the App Lead in working on the alert reduction plan
- Change Management:
- Review changes and assess their end-to-end impact and any limitations that might destabilize or impact production
- Ensure changes are thoroughly tested in Replica environments and meet all production standards
- Application Onboarding & New Projects:
- Participate in and support project activities (upgrades, migrations, new product implementations)
- Lead the Functional and Regression Testing activities
- Ensure the completeness of performance and stress testing
- Learning, Training & Documentation:
- Create and update the technical documentation (runbooks, configuration, design docs) as per the review cycle
- Create Standard Operating Procedures to be shared with all team members for immediate actions
- Prepare a training calendar in coordination with the App Lead, prepare the training material, and train team members for operations
- Information Security & Audit Compliance:
- Lead and address application security concerns (InfoSec observations, BAVAMA tasks) and ensure they are actioned and closed on a priority basis