New to an SRE team?

If you are new to an SRE team, or the first SRE at your company, this post may be helpful. It touches on the ideas, questions, and thoughts you should have when joining a new team and company. Whether you are joining an established team or are the first SRE (ops / techops / devops) hire, this is a good starting point. I have been in your shoes before, both as "Numero Uno" and as a new member of an established team. This is how I have always approached the situation, and it has always helped me get the lay of the land and understand where we are and where we want to go as a company.

I chose to break this down into a few phases. You can mix and match as you like, but understand that at some point you will have to become familiar with all of these aspects. Doing them right off the bat will help you succeed at the company you have just joined. The list is high level, and each subtopic could be drilled into much further. The idea is to start in these areas and go as deep as your curiosity takes you. Just don't fall too far down the rabbit hole... you may get lost.

Phase I - Meet and Greet

  • Talk with as many people as you can and get their perspective on the current state of the system.
  • Ask as many questions as you can think of about what each team is trying to accomplish and what their ideal world looks like.
  • Find out every pain point people have with the current system.
  • Ask how many outages are normal for the current system.
  • Find out the current relationships between teams and how well they collaborate.
  • Ask each team: if you could improve one thing, what would it be?
  • Find the longest-serving team members and ask them to outline all the skeletons and which closets they live in.
  • TAKE NOTES! I cannot stress this enough. I get that you are smart, but writing things down is key to success.
  • Find out what is expected of you and your team, as well as what is being supported.

Phase II - Physical / Virtual Infrastructure

  • Infrastructure Layout
    • Cloud or Physical or Hybrid?
    • Colo or closet?
    • Hardware Type & Vendors (chassis, CPU, memory, disk, network, etc.)
  • Operating System blueprint
    • OS Distros & Versions
    • Basic or custom drivers?
    • OS/Kernel Tunes
    • Packages
    • Custom compiled libraries (if any)
    • Custom tweaks to OS (if any)
  • Infrastructure Services
    • Current state of documentation
    • Monitoring System
    • Configuration Management (if any)
    • Common Infrastructure Tooling (DNS, DHCP, PXE, LDAP, etc.)
    • Infrastructure Tooling (custom services & tools)
    • GitHub/GitLab/SVN workflow (team repos, CI/CD for infra, etc.)
    • Deployment mechanism
    • External tool dependencies
    • Databases / Queues / Services that are used by application teams
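
For the operating system items above, one way to start your notes is to record what each host reports about itself. Here is a minimal sketch in Python using only the standard library. The function name `os_blueprint` and the choice of fields are my own assumptions for illustration, not a standard, and it only covers distro and kernel basics; packages, drivers, and kernel tunes need distro-specific tooling.

```python
import json
import platform


def os_blueprint():
    """Collect basic OS facts that the running host reports about itself."""
    facts = {
        "hostname": platform.node(),
        "system": platform.system(),    # e.g. "Linux"
        "kernel": platform.release(),   # kernel release string
        "machine": platform.machine(),  # CPU architecture
    }
    # freedesktop_os_release() exists on Python 3.10+ and raises OSError
    # when no os-release file is present (for example on non-Linux hosts).
    try:
        release = platform.freedesktop_os_release()
        facts["distro"] = release.get("NAME", "unknown")
        facts["distro_version"] = release.get("VERSION_ID", "unknown")
    except (AttributeError, OSError):
        facts["distro"] = "unknown"
        facts["distro_version"] = "unknown"
    return facts


if __name__ == "__main__":
    # Print the facts as JSON so they can be pasted into your notes.
    print(json.dumps(os_blueprint(), indent=2))
```

Running this on a handful of hosts and comparing the output is a quick way to spot drift in distro or kernel versions before you dig into anything deeper.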

Phase III - Application / Services

  • Application / Service current state
    • Number of applications (how many different apps)
    • Types of applications (what these apps do)
    • Languages in environment
    • Dependencies for stack
    • Current state of CI/CD
    • Current state of documentation
    • Monitoring around application
    • Common issues in application environment
    • On-call for outages and escalation policies (if any)
  • Application / Service future wants
    • Expectations of the SRE team and organization
    • New technologies that are desired
    • Improvements that can be made
    • Architecture changes

Phase IV - Start Work

  • Improve/Implement current state & architecture of infrastructure
  • Improve/Implement process around environment deployment(physical or virtual)
  • Improve/Implement tools and services in environment
  • Improve/Implement config management
  • Improve/Implement best practices around software deployment
  • Improve/Implement relationships with outside teams (technical & non-technical)
  • Improve/Implement as much as you can possibly handle, in all areas you think will benefit the company's mission

    The final phase is always the most fun. You have gathered as much information as you possibly could, and now it's time to bring your ideas for this environment to life. Not everyone will be on board with everything you suggest or want to do, and that should be expected. We have a responsibility, not only as teammates but as human beings, to remember that everything was done a certain way for a reason. Whether or not it makes sense to you, someone else's thought process led them to do it that way. By no means should you ridicule or belittle the decision that was made. It was likely good enough at that point in time, and now it's your turn to suggest improvements based on your past experience and knowledge. You were brought in because this same group of peers thought you could be of value to their mission. Time and time again I have seen people instantly disregarded because they complained before understanding why things were the way they were. This is my attempt to make sure that doesn't happen to you, and that we all work together to build a better culture in technology.