This Position is All About
At HBC, we are undergoing a major technology
transformation to embrace the latest technology changes in the industry. As
part of that, we are migrating our critical infrastructure and application
platforms to public clouds like Azure/AWS. The Reliability Engineer is an
experienced, energetic professional who will help drive improvements to the
monitoring, logging & telemetry capabilities of the application and
infrastructure components.
As The Reliability Engineer, You Will:
- Improve reliability of the platform
by developing and enhancing monitoring & automation, including self-healing
- Create procedures/runbooks for
operational aspects of monitoring, logging & telemetry
- Work with admins and platform
engineers through implementation decisions to achieve highly reliable
- Provide business and engineering
support services for both in-house and external customers
- Resolve system and performance
issues by careful analysis, debugging, identifying the root cause and applying
- Document all changes following
controls, procedures and documentation standards, and raise issues and concerns
with recommendations for follow-up action
You Also Have:
- Bachelor's Degree in Computer
Science or equivalent
- Azure/AWS, Microsoft, Red Hat
certifications and knowledge of ITIL/MOF practices
- Experience with monitoring, logging
& telemetry tools like Splunk, ELK, Nagios, SolarWinds, Prometheus,
AWS CloudWatch/Cloud Metrics, Datadog, etc.
- Understanding of and hands-on
experience with orchestration frameworks like Kubernetes.
- Experience with automation and tools
such as (but not limited to) Jenkins, Chef, Terraform, Ansible, etc.
- Expertise in creating and maintaining
automation (PowerShell, Python, Ruby, AWK, SED, etc.) to run health checks and
provide self-healing capabilities for the platforms.
- Very good experience with the
Solutions, including Windows Server Operating Systems
fundamentals, TCP/IP, DNS, WINS, DHCP, etc.
- Experience with Git branching strategies, applying CI/CD principles to infrastructure
build and deployment processes
- Experience with ticketing tools like Jira, Cherwell, etc.
- Hands-on
experience with Linux Server solutions, including RPM, Puppet, Satellite,
- Working knowledge of databases (Oracle, MS SQL,
Teradata, DB2, etc.)
- 2-4 years of experience working in global organizations with the ability to communicate with senior
- 2-4 years of experience working on monitoring
& logging tools, self-healing solutions and platform automation
How Often You May
Your Life and Career
- Be part of a world-class team; work with an adventurous spirit;
think and act like an owner-operator!
- Exposure to rewarding career advancement opportunities, from retail
to supply chain, to digital or corporate.
- A culture that promotes a healthy, fulfilling work/life balance.
- Benefits package for all eligible full-time employees (including
medical, vision and dental).
- An amazing employee discount.
Thank you for your interest in HBC. We look forward to reviewing your application.
HBC provides equal employment opportunities (EEO) to all employees and applicants for employment without regard to race, color, religion, sex, national origin, age, disability or genetics. In addition to federal law requirements, HBC complies with applicable state and local laws governing nondiscrimination in employment in every location in which the company has facilities. This policy applies to all terms and conditions of employment, including recruiting, hiring, placement, promotion, termination, layoff, recall, transfer, leaves of absence, compensation and training.
HBC welcomes all applicants for this position. Should you be individually selected to participate in an assessment or selection process, accommodations are available upon request in relation to the materials or processes to be used.