In this role, you’ll:
- Administer HPC clusters for deep learning workloads (Slurm, Ceph, etc.)
- Administer application servers: web services (Nextcloud, Forgejo, OpenProject, WordPress, ERP systems, etc.), mail, XMPP, SIP, LDAP
- Administer workstations using Ansible
We require you to have:
- Strong experience in Linux system administration (Ubuntu/Debian-based preferred)
- Solid knowledge of HPC environments and workload schedulers (e.g., Slurm)
- Experience with distributed storage systems (e.g., Ceph or similar)
- Familiarity with automation and configuration management tools (e.g., Ansible)
- Experience administering network services (LDAP, mail servers, web services, etc.)
- Good understanding of networking fundamentals and security best practices
It’s a plus if you have:
- Experience with C++, CUDA, Python or Rust
- Knowledge of the Linux kernel or low-level hardware programming
- Web development experience