Site Reliability Engineer

Job Locations IN-Pune
Requisition ID
Job Category
Research and Development
Travel Requirements




SAS technology allows customers to achieve great results in a wide range of diverse industries. From ensuring your bank account is not compromised to analyzing the data behind many goods, products and pharmaceuticals, or helping with natural disasters, SAS is driving a revolution in the way big data is used every day! #Data4Good.


SAS R&D Pune, with its experience in developing applications and solutions for a variety of domains, has built a reputation within SAS for excellence in the development and delivery of high-quality applications across an expanding set of technologies. We aim to push our developers’ careers in the direction they have always envisaged.


Welcome to the forefront of making a difference!



About the role:


The managed-tenants’ Site Reliability Engineer (SRE) role is based at the SAS R&D Pune facility. In this role, you will be part of the Retail Enablement Team and will help architect, modify, improve, and support the platform running the user-facing Software as a Service (SaaS) and managed service offerings of the SAS Retail solution on Microsoft Azure. Using your expertise in the site reliability engineering principles of automation and continuous improvement, you will help create an environment where availability, reliability, and security are threaded through the entire application life cycle, not treated as an afterthought. As a Site Reliability Engineer, you will write new software as required to automate the building, testing, deployment, promotion, monitoring, alerting, and maintenance of managed SAS software.

The SRE is accountable for the overall reliability of services running in our cloud production environments. As an SRE, you will engage with the software engineering and infrastructure teams at every level, embedding with them directly or bringing them in to resolve production challenges. You will also be joining a friendly team with a broad range of experience.


Essential Technology Experience:

  • 5 years of related experience with a technical Bachelor’s degree; or equivalent practical experience
  • Experience with Python, GitLab, OpenAPI, and DevOps processes
  • Experience with development and deployment in a hosted cloud environment, preferably Microsoft Azure
  • Experience with distributed cloud service development, infrastructure, traffic management and architecture
  • Experience with Kubernetes environments and understanding of multi-tenancy and security implications
  • Experience working with container deployment and orchestration technologies, with knowledge of fundamentals including service discovery, deployments, monitoring, scheduling, and load balancing
  • Passion for automating common tasks and processes, and dedication to writing well-tested and maintainable code
  • Enthusiasm for designing, developing, and maintaining distributed systems at scale in production
  • Knowledge of standard methodologies related to security, performance, and disaster recovery
  • Skill in identifying performance bottlenecks, detecting anomalous system behavior, and resolving the root cause of service issues


Experience with the following would be useful:

  • Experience writing code in Java or Go
  • Experience with Agile software development methodologies
  • Experience developing a Kubernetes controller, operator, or platform component
  • Operations experience with a production user-facing application
  • Experience with thread dump and heap dump analysis, and an understanding of performance engineering concepts and tools such as JMeter, JProfiler, and Dynatrace, would be a huge plus

We are a friendly team, and we’ll be offering you plenty of opportunities to develop your career. Interested? Then please get in touch to find out more! 


Primary Responsibilities

  • Collaborating closely with product stakeholders and engineering teams to deliver and scale secure, stable, observable applications
  • Proactively writing monitors and alerts to ensure actionable data insights are provided to application and infrastructure teams
  • Documenting to ensure the utility and maintainability of the platform and shared services
  • Providing constructive feedback on your colleagues’ pull requests, and accepting constructive feedback on your own pull requests in return



Background and Experience

SAS is seeking a Site Reliability Engineer with 4-5 years or more of experience in the retail software domain. The candidate should have a successful track record of working in hosting organizations, preferably developing infrastructure and software automation for global teams. The candidate should also have excellent verbal and written communication skills and the ability to communicate clearly and kindly with technical and non-technical colleagues.



