DevOps/SRE Engineer

At AID:Tech, we believe that the future of finance is borderless, accessible, private and personalised. In line with that vision, our mission is to reduce inequality and increase opportunity by making identity and payment services seamless and accessible to all.

About the Role

As a general rule of thumb, all candidates (regardless of the position) must meet the following:

- Passion and curiosity for creating and building things. You enjoy learning new things and taking on new challenges

- Problem solver. You have the habit and skills required to bring structure and clarity to potentially ambiguous or complicated scenarios

- Proactive attitude. You are awesome and know that it is always better to propose solutions when discussing any issue

- Independent, with experience of and a taste for working remotely. You are familiar with the tools and rituals of the trade, and comfortable working on your own and communicating effectively and asynchronously with your teammates

- Generous with your knowledge and experience. You enjoy putting on your mentor hat whenever appropriate to share your knowledge with the rest of your teammates

- Fluent in English, written and spoken. We are a multinational and multicultural team after all

- The successful candidate will be responsible for designing, deploying and managing our infrastructure based on the technical requirements of our team and clients.

- This is a broad-discipline system administration role with requirements to manage significant technical diversity.

Responsibilities:

- Ensuring data recoverability by developing and implementing a schedule of system backups and database archive operations

- Ensuring system data integrity by evaluating, implementing, and managing appropriate software and hardware solutions of varying complexities

- Conducting hardware and software audits of workstations and remote servers to ensure compliance with established standards, policies, and configuration guidelines

- Managing subscriptions, onboarding, coordination and billing, and educating fellow team members

- Working directly with architects and developers to debug problems and develop solutions

- Enforcing best practices with external vendors; relationship and negotiating skills will be used often

- Developing, implementing and enforcing standard operating procedures and schedules to help scale the entire engineering team

- Providing technical support to team members, including setting up accounts for new team members using onboarding automation

- Lots of experience effectively and securely setting up and managing Cloud Computing environments with several providers

- You’re comfortable designing and deploying a K8s cluster with an HA control-plane either from scratch or using some of the managed services available

- You understand the implications, benefits and trade-offs of using nginx or HAProxy as an ingress controller, and are able to clearly identify and communicate the best option for a given scenario taking into account any relevant constraints and use-cases

- You understand the differences and trade-offs of utilizing an ALB (layer 7) or NLB (layer 4) load balancer solution; have practical experience with both and are able to clearly identify and communicate the best option for a given scenario taking into account any relevant constraints and use-cases

- You are familiar with basic and intermediate network security concepts and are able to set up and deploy the required tools and services in order to provide a secure and reliable environment for our applications.

- You have a lot of practical experience working with monitoring and telemetry solutions, including the relevant specifications and standards like OpenCensus, OpenTracing and OpenTelemetry. You could spend hours explaining the differences between the pillars of observability (i.e., logs, metrics and distributed traces) and how they complement each other

- You are able to design and deploy a complete end-to-end telemetry data ingestion pipeline, either from scratch using open-source components or by recommending and setting up commercial offerings. By end-to-end we mean everything from setting up and deploying data collector agents to the dashboards and automated alerting components required to measure and monitor the SLAs and SLOs established on the business side

- You’ll be able to assist our development team in setting up CI/CD workflows to automate common operations, from running QA tasks to artifact management

- When faced with the question “Should we use CockroachDB or ScyllaDB for X?”, you should be able to analyze the business needs, run any required benchmarks, produce a report documenting your findings, and finally recommend a solution

Qualifications:

- 5+ years of previous experience as a DevOps Engineer, Systems Administrator or in a related role

- Strong background in deployment automation/configuration management of large scale distributed systems

- Experience with Cloud Compute Providers (AWS preferred but DigitalOcean, Google Cloud and Azure also considered).

- Linux SysAdmin expertise

- Experience with Docker orchestration

- Significant experience designing tools for infrastructure management

- Experience doing root cause analysis on distributed systems

- Outstanding written and verbal communication skills, being able to explain complex technical concepts clearly and succinctly

- Team player, able to function cohesively within a globally distributed team

- Bachelor’s degree in Computer Science, Systems Engineering or a related IT field (or an additional 3 years of experience)

Bonus points for:

- Previous experience with blockchain/DLT technologies

- Understanding of financial securities and the various participants in the securities ecosystem

- Start-up experience, including remote teams

- Knowledge of security best practices

- Personal experience managing cryptocurrencies and private keys