Contract type: Permanent
Location: Selangor
Salary: RM10,000 - RM15,000
Start date: 28-09-2021
Reference: PR/146076
Contact details: Daniel Ng
Contact email:
Job published: September 02, 2021 12:42
  • Run the environments by monitoring availability and taking a holistic view of system health
  • Build software and systems to manage platform infrastructure and applications
  • Improve reliability, quality, and time to market of our suite of software solutions
  • Measure and optimize system performance, with an eye toward pushing our capabilities forward, getting ahead of customer needs, and innovating to continually improve
  • Provide primary operational support and engineering for multiple large distributed software applications
  • Lead and assist in implementing SRE best practices within the team and in collaboration with other teams
  • Help train and coach team members who are new to the technologies and SRE practices
  • Gather and analyze metrics from systems and applications to assist in performance tuning and fault finding
  • Partner with development teams to improve services through rigorous testing and release procedures
  • Participate in system design consulting, platform management, and capacity planning
  • Create sustainable systems and services through automation and uplifts
  • Balance feature development speed and reliability with well-defined service level objectives
  • Continuously improve the operation's monitoring solution
  • Continuously implement automation that improves the operation and its reliability
  • Minimum of a degree or technical training in Computer Science, or an equivalent combination of training and/or experience
  • At least 3 years' experience in software development
  • Strong knowledge of Linux and virtual machines (VMs)
  • Competent knowledge of at least one database
  • Ability to program in one or more high-level languages, including at least Python
  • Experience with log analysis
  • Experience with root cause analysis
  • Knowledge of and experience with Nagios and Splunk will be an added advantage
  • A proactive approach to spotting problems, areas for improvement, and performance bottlenecks
  • Knowledge of or experience with other related technologies (OpenShift, Kubernetes) is advantageous