• Principal Architect

    Location: US-TX-Austin
    Business Group: SolarWinds Cloud
  • Overview

    SolarWinds Cloud is developing a monitoring solution incorporating best-of-breed products: Papertrail (hosted logs), Librato (time-series metrics) and Tracelytics (APM and distributed tracing).

    We are currently hiring Senior Data Engineers who enjoy working on large-scale distributed systems problems to build a metrics and monitoring solution used by thousands of customers. 

    We’re a small team, so everyone has the opportunity to have a big impact. We’ve built our platform largely on Java 8 Dropwizard services, a handful of Golang services and some C++ where performance is critical. We use Kafka as our main service bus and combine it with our in-house stream processing framework (SuperChief) for real-time processing of millions of messages per second and tens of terabytes of logs per day. We store hundreds of terabytes of metrics and logs in Cassandra and MySQL clusters. We are big fans of ZooKeeper for lightweight intra/inter-service coordination.

    All team members, whether local or remote, commit code to GitHub, communicate over Slack and Google Hangouts, push code to production via our ChatOps bot, and run all production applications on AWS. We also use an array of best-of-breed SaaS applications to get code to production quickly and reliably. We are a team that is committed to a healthy work/life balance.

    Papertrail, Librato and Tracelytics are wholly owned by SolarWinds Inc., so you get the benefits of a small startup with the backing of a big company and no worry about the next round of funding. SolarWinds offers competitive bonus and matching 401k programs that create an attractive total compensation package.


    • Be a crucial contributor to the SolarWinds Cloud backend architecture
    • Build distributed systems using Java 8, C++, Go, and Ruby
    • Help drive the next generation of monitoring tools for cloud applications
    • Work with massive datasets in a real-time distributed system
    • Continually improve availability, scalability, performance and automation of our services
    • Explore and evaluate cutting-edge distributed systems technologies and practices
    • Come up with creative solutions to solve tough scalability and performance problems
    • Work with a distributed team of engineers across all layers of the product
    • Architect applications that leverage the latest capabilities provided by cloud technologies


    The right candidate is adept at building scalable and highly available systems in modern systems languages. You are religious about using metrics to reason about the characteristics of an application, client library, or data store, and you use them to drive your decisions when shipping to production. You are a developer who appreciates well-written code and cares about the impact of your design decisions on the user experience.


    • 4+ years of distributed systems experience with Java, Go or C++
    • Comfortable with using and reasoning about concurrency primitives
    • Passion for exploring emerging frameworks, libraries and technology stacks
    • Experience with ZooKeeper, Dropwizard, Kafka, or Cassandra
    • Understand the importance of metric instrumentation
    • Experience with building and consuming REST APIs
    • Experience with highly-available (NoSQL) data stores
    • Comfortable debugging network, disk and performance bugs in complex distributed systems
    • Experience developing in Linux environments
    • 6+ years of relevant engineering experience
    • Git and Maven savvy
    • Comfortable with cloud-based deployment and remote teams


    Extra Credit:


    • On-call experience fire-fighting applications in production
    • Able to write applications that use SQL databases
    • Experience working with a remote team
    • Experience with AWS cloud
    • Have built stream-processing applications using frameworks like Heron/Storm/Samza
    • Have worked with large time-series datasets

    What's in it for you?


    We offer competitive compensation and benefits and the opportunity to solve challenging problems with skilled colleagues. Our distributed team uses best practices to maximize our development velocity, including but not limited to: ChatOps, continuous integration/deployment, code review via GitHub pull requests, and a preference for asynchronous communication over meetings. We are a team committed to work/life balance (really: http://blog.librato.com/posts/black-friday), with hackday events (http://blog.librato.com/posts/hack-day) and fewer meetings, more shipping!


    Technology Showcase


    Here are just a few highlights of the systems we’ve built or work with on a regular basis:


