Alternatives to AWS X-Ray

New Relic, Dynatrace, AppDynamics, Jaeger, and Splunk are the most popular alternatives and competitors to AWS X-Ray.

What is AWS X-Ray and what are its top alternatives?

AWS X-Ray helps developers analyze and debug distributed applications in production, such as those built using a microservices architecture. With it, you can understand how your application and its underlying services are performing, so you can identify and troubleshoot the root cause of performance issues and errors. It provides an end-to-end view of requests as they travel through your application and shows a map of your application’s underlying components.
AWS X-Ray is a tool in the Performance Monitoring category of a tech stack.

Top Alternatives to AWS X-Ray

  • New Relic

    The world’s best software and DevOps teams rely on New Relic to move faster, make better decisions and create best-in-class digital experiences. If you run software, you need to run New Relic. More than 50% of the Fortune 100 do too. ...

  • Dynatrace

    It is an AI-powered, full stack, automated performance management solution. It provides user experience analysis that identifies and resolves application performance issues faster than ever before. ...

  • AppDynamics

    AppDynamics develops application performance management (APM) solutions that deliver problem resolution for highly distributed applications through transaction flow monitoring and deep diagnostics. ...

  • Jaeger

    Jaeger, a Distributed Tracing System

  • Splunk

    It provides the leading platform for Operational Intelligence. Customers use it to search, monitor, analyze and visualize machine data. ...

  • ELK

    It is the acronym for three open source projects: Elasticsearch, Logstash, and Kibana. Elasticsearch is a search and analytics engine. Logstash is a server‑side data processing pipeline that ingests data from multiple sources simultaneously, transforms it, and then sends it to a "stash" like Elasticsearch. Kibana lets users visualize data with charts and graphs in Elasticsearch. ...

  • JavaScript

    JavaScript is best known as the scripting language for Web pages, but it is also used in many non-browser environments, such as Node.js or Apache CouchDB. It is a prototype-based, multi-paradigm scripting language that is dynamic and supports object-oriented, imperative, and functional programming styles. ...

  • Git

    Git is a free and open source distributed version control system designed to handle everything from small to very large projects with speed and efficiency. ...

AWS X-Ray alternatives & related posts

New Relic

20.7K
8.5K
1.9K
New Relic is the industry’s largest and most comprehensive cloud-based observability platform.
PROS OF NEW RELIC
  • 415
    Easy setup
  • 344
    Really powerful
  • 244
    Awesome visualization
  • 194
    Ease of use
  • 151
    Great ui
  • 107
    Free tier
  • 80
    Great tool for insights
  • 66
    Heroku Integration
  • 55
    Market leader
  • 49
    Peace of mind
  • 21
    Push notifications
  • 20
    Email notifications
  • 17
    Heroku Add-on
  • 16
    Error Detection and Alerting
  • 13
    Multiple language support
  • 11
    Server Resources Monitoring
  • 11
    SQL Analysis
  • 9
    Transaction Tracing
  • 8
    Azure Add-on
  • 8
    Apdex Scores
  • 7
    Detailed reports
  • 7
    Analysis of CPU, Disk, Memory, and Network
  • 6
    Application Response Times
  • 6
    Performance of External Services
  • 6
    Application Availability Monitoring and Alerting
  • 6
    Error Analysis
  • 5
    JVM Performance Analyzer (Java)
  • 5
    Most Time Consuming Transactions
  • 4
    Top Database Operations
  • 4
    Easy to use
  • 4
    Browser Transaction Tracing
  • 3
    Application Map
  • 3
    Weekly Performance Email
  • 3
    Custom Dashboards
  • 3
    Pagoda Box integration
  • 2
    App Speed Index
  • 2
    Easy to setup
  • 2
    Background Jobs Transaction Analysis
  • 1
    Time Comparisons
  • 1
    Access to Performance Data API
  • 1
    Super Expensive
  • 1
    Team Collaboration Tools
  • 1
    Metric Data Retention
  • 1
    Metric Data Resolution
  • 1
    Worst Transactions by User Dissatisfaction
  • 1
    Real User Monitoring Overview
  • 1
    Real User Monitoring Analysis and Breakdown
  • 1
    Free
  • 1
    Best of the best, what more can you ask for
  • 1
    Best monitoring on the market
  • 1
    Rails integration
  • 1
    Incident Detection and Alerting
  • 0
    Cost
  • 0
    Exceptions
  • 0
    Price
  • 0
    Proce
CONS OF NEW RELIC
  • 20
    Pricing model doesn't suit microservices
  • 10
    UI isn't great
  • 7
    Expensive
  • 7
    Visualizations aren't very helpful
  • 5
    Hard to understand why things in your app are breaking

related New Relic posts

Cooper Marcus
Director of Ecosystem at Kong Inc. · 17 upvotes · 110.5K views
Shared insights on New Relic, GitHub, and Zapier

I've used more and more of New Relic Insights here in my work at Kong. New Relic Insights is a "time series event database as a service" with a super-easy API for inserting custom events, and a flexible query language for building visualization widgets and dashboards.

I'm a big fan of New Relic Insights when I have data I know I need to analyze, but perhaps I'm not exactly sure how I want to analyze it in the future. For example, at Kong we recently wanted to get some understanding of our open source community's activity on our GitHub repos. I was able to quickly configure GitHub to send webhooks to Zapier, which in turn posted the JSON to New Relic Insights.

Insights is schema-less and configuration-less - just start posting JSON key value pairs, then start querying your data.

Within minutes, data was flowing from GitHub to Insights, and I was building widgets on my Insights dashboard to help my colleagues visualize the activity of our open source community.
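The "just start posting JSON key value pairs" step can be sketched roughly as follows. This is a minimal illustration, not the setup used at Kong: the endpoint URL and the `X-Insert-Key` header are recalled from the classic Insights insert API and should be checked against current New Relic documentation, while the `GitHubActivity` event type, the account ID, and the key are hypothetical placeholders. The request is built but never sent.

```python
import json
import urllib.request

# Assumption: the classic Insights insert endpoint and X-Insert-Key header;
# verify both against current New Relic documentation before relying on them.
INSIGHTS_URL = "https://insights-collector.newrelic.com/v1/accounts/{account_id}/events"


def flatten(payload, prefix=""):
    """Flatten nested JSON into underscore-joined key/value pairs (events are flat)."""
    flat = {}
    for key, value in payload.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}_"))
        elif isinstance(value, (str, int, float, bool)):
            flat[name] = value
        # Lists and nulls are skipped: they do not map onto a single attribute.
    return flat


def build_event(webhook_payload, event_type="GitHubActivity"):
    """Turn a webhook payload into one flat event carrying an eventType attribute."""
    event = flatten(webhook_payload)
    event["eventType"] = event_type  # hypothetical event type name
    return event


def build_request(events, account_id, insert_key):
    """Prepare (but do not send) the HTTP POST carrying a batch of events."""
    return urllib.request.Request(
        INSIGHTS_URL.format(account_id=account_id),
        data=json.dumps(events).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-Insert-Key": insert_key},
        method="POST",
    )


sample = {"action": "opened", "repository": {"full_name": "Kong/kong"}, "labels": []}
event = build_event(sample)
request = build_request([event], account_id="12345", insert_key="hypothetical-key")
# Sending would be: urllib.request.urlopen(request)
```

Because no schema is declared up front, adding a new attribute later only means including another key in the event dictionary.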

#GitHubAnalytics #OpenSourceCommunityAnalytics #CommunityAnalytics #RepoAnalytics

Julien DeFrance
Principal Software Engineer at Tophatter · 16 upvotes · 3.1M views

Back in 2014, I was given an opportunity to re-architect SmartZip's analytics platform and flagship product, SmartTargeting. This is a SaaS product that helps real estate professionals keep up with their prospects and leads in a given neighborhood/territory, find out (thanks to predictive analytics) who is most likely to list/sell their home, and run cross-channel marketing automation against them: direct mail, online ads, email... The company also provides Data APIs to Enterprise customers.

I had inherited years and years of technical debt and I knew things had to change radically. The first enabler to this was to make use of the cloud and go with AWS, so we would stop re-inventing the wheel, and build around managed/scalable services.

For the SaaS product, we kept on working with Rails, as this was what my team had the most knowledge in. We did, however, break up the monolith and decouple the front-end application from the backend by using Rails API, so we'd get independently scalable microservices from then on.

Our various applications could now be deployed using AWS Elastic Beanstalk, so we wouldn't waste any more effort writing time-consuming Capistrano deployment scripts, for instance. We combined this with Docker so each application would run within its own container, independent of the underlying host configuration.

Storage-wise, we went with Amazon S3 and ditched any pre-existing local or network storage people used to deal with in our legacy systems. On the database side, we started with Amazon RDS / MySQL and ultimately migrated to Amazon RDS for Aurora / MySQL when it was released. Once again, here you need a managed service your cloud provider handles for you.

Future improvements / technology decisions included:

  • Caching: Amazon ElastiCache / Memcached
  • CDN: Amazon CloudFront
  • Systems Integration: Segment / Zapier
  • Data-warehousing: Amazon Redshift
  • BI: Amazon Quicksight / Superset
  • Search: Elasticsearch / Amazon Elasticsearch Service / Algolia
  • Monitoring: New Relic

As our usage grew, patterns changed, and/or our business needs evolved, my role as Engineering Manager and then Director of Engineering was also to ensure my team kept on learning and innovating while delivering business value.

One of these innovations was to get ourselves into serverless: adopting AWS Lambda was a big step forward. At the time, it was only available for Node.js (not Ruby), but it was a great way to handle cost efficiency, unpredictable traffic, sudden bursts of traffic... Ultimately you want the whole chain of services involved in a call to be serverless, and that's when we started leveraging Amazon DynamoDB on these projects so they'd be fully scalable.
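As a rough illustration of a fully serverless call chain, the sketch below shows a Lambda-style handler that stores each request in a DynamoDB-like table. It is written in Python rather than the Node.js the post mentions, and it is not the SmartZip implementation: the item attributes (`id`, `received_at`, `payload`) and the sample request body are hypothetical, and an in-memory stand-in replaces the real table so the example runs without AWS credentials.

```python
import json
import time
import uuid


def make_handler(table):
    """Build a Lambda-style handler around a DynamoDB-table-like object.

    `table` only needs a put_item(Item=...) method, which is what a boto3
    DynamoDB Table resource provides; injecting it keeps the handler testable.
    """

    def handler(event, context):
        body = json.loads(event.get("body") or "{}")
        item = {
            "id": str(uuid.uuid4()),          # hypothetical partition key
            "received_at": int(time.time()),  # epoch seconds
            "payload": json.dumps(body),
        }
        table.put_item(Item=item)
        return {"statusCode": 201, "body": json.dumps({"id": item["id"]})}

    return handler


class InMemoryTable:
    """Stand-in for a real table so the sketch runs without AWS access."""

    def __init__(self):
        self.items = []

    def put_item(self, Item):
        self.items.append(Item)


table = InMemoryTable()
handler = make_handler(table)
response = handler({"body": json.dumps({"lead": "123 Main St"})}, context=None)
```

In a deployed function, `boto3.resource("dynamodb").Table(...)` would take the place of `InMemoryTable`, and neither the compute nor the storage layer would need capacity planned ahead of a traffic burst.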

Dynatrace

323
339
28
Monitor, optimize, and scale every app, in any cloud
PROS OF DYNATRACE
  • 4
    Real User Monitoring
  • 4
    Automated RCA
  • 3
    Out-of-the-box distributed transaction tracing
  • 2
    Built on massive industry expertise (since 2005)
  • 2
    AI-powered platform
  • 2
    Extensible via SDK
  • 1
    Digital Experience
  • 1
    Easy setup
  • 1
    Accelerate software delivery
  • 1
    Infrastructure Monitoring
  • 1
    Applications & Microservices
  • 1
    Application Security
  • 1
    Built on API-first design principles
  • 1
    Automatic instrumentathird generation full stack Agents
  • 1
    Analytics vMotion events detection Discovery Performanc
  • 1
    Automation
  • 1
    Business Analytics
CONS OF DYNATRACE
  • 0
    Application Security
  • 0
    Real User Monitoring
  • 0
    Infrastructure Monitoring
  • 0
    Applications & Microservices
  • 0
    AI-powered platform

related Dynatrace posts

Farzeem Diamond Jiwani
Software Engineer at IVP · 8 upvotes · 1.4M views

Hey there! We are looking at Datadog, Dynatrace, AppDynamics, and New Relic as options for our web application monitoring.

Current Environment: .NET Core Web app hosted on Microsoft IIS

Future Environment: Web app will be hosted on Microsoft Azure

Tech Stacks: IIS, RabbitMQ, Redis, Microsoft SQL Server

Requirement: Infra Monitoring, APM, Real-User Monitoring (user activity monitoring, i.e., time spent on a page, most active page, etc.), Service Tracing, Root Cause Analysis, and Centralized Log Management.

Please advise on the above. Thanks!

Hi Folks,

I am trying to evaluate Site24x7 against AppDynamics, Dynatrace, and New Relic. Has anyone used Site24x7? If so, what are your opinions on the tool? I know that the license costs are very low compared to other tools on the market. Other than that, are there any major issues anyone has encountered using the tool itself?

AppDynamics

304
618
68
Application management for the cloud generation
PROS OF APPDYNAMICS
  • 21
    Deep code visibility
  • 13
    Powerful
  • 8
    Real-Time Visibility
  • 7
    Great visualization
  • 6
    Easy Setup
  • 6
    Comprehensive Coverage of Programming Languages
  • 4
    Deep DB Troubleshooting
  • 3
    Excellent Customer Support
CONS OF APPDYNAMICS
  • 5
    Expensive
  • 2
    Poor to non-existent integration with AWS services

related AppDynamics posts

We are evaluating APM tools and would like to choose between AppDynamics and Datadog. Our applications are largely hosted on Microsoft Azure, but we would like to keep the option to move to AWS or Google Cloud Platform in the future.

In addition to core Azure services, we will be hosting other components - including MongoDB, Keycloak, PagerDuty, etc. Our applications are largely C# and React-based using frontend for Backend patterns and Azure API gateway. In addition, there are close to 50+ external services integrated using both REST and SOAP.

Jaeger

330
455
21
Distributed tracing system released as open source by Uber
PROS OF JAEGER
  • 6
    Easy to install
  • 6
    Open Source
  • 5
    Feature Rich UI
  • 4
    CNCF Project
CONS OF JAEGER
  • None listed

    related Jaeger posts

    Conor Myhrvold
    Tech Brand Mgr, Office of CTO at Uber · 44 upvotes · 10M views

    How Uber developed the open source, end-to-end distributed tracing system Jaeger, now a CNCF project:

    Distributed tracing is quickly becoming a must-have component in the tools that organizations use to monitor their complex, microservice-based architectures. At Uber, our open source distributed tracing system Jaeger saw large-scale internal adoption throughout 2016, integrated into hundreds of microservices and now recording thousands of traces every second.
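To make the idea of a trace concrete, here is a minimal sketch of how spans from several services can be tied into one trace. It does not use the Jaeger client libraries or reflect Uber's implementation: the `Span`, `Collector`, and `start_span` names and the service names are hypothetical, and the collector is an in-memory list that services push spans to.

```python
import uuid
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Span:
    trace_id: str             # shared by every span in one request
    span_id: str              # unique to this unit of work
    parent_id: Optional[str]  # None for the root span
    service: str
    operation: str


@dataclass
class Collector:
    """Receives spans pushed by services (an in-memory stand-in)."""
    spans: List[Span] = field(default_factory=list)

    def report(self, span: Span) -> None:
        self.spans.append(span)


def start_span(collector, service, operation, parent=None):
    """Start a span; child spans inherit the parent's trace id."""
    span = Span(
        trace_id=parent.trace_id if parent else uuid.uuid4().hex,
        span_id=uuid.uuid4().hex[:16],
        parent_id=parent.span_id if parent else None,
        service=service,
        operation=operation,
    )
    collector.report(span)
    return span


collector = Collector()
root = start_span(collector, "api-gateway", "GET /ride")
pricing = start_span(collector, "pricing", "quote", parent=root)
db = start_span(collector, "pricing-db", "SELECT fare", parent=pricing)

# Everything that happened for this one request, across three services.
trace = [s for s in collector.spans if s.trace_id == root.trace_id]
```

The parent links are what let a tracing UI rebuild the call tree for a request after the spans arrive from different services.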

    Here is the story of how we got here, from investigating off-the-shelf solutions like Zipkin, to why we switched from pull to push architecture, and how distributed tracing will continue to evolve:

    https://eng.uber.com/distributed-tracing/

    (GitHub Pages: https://www.jaegertracing.io/, GitHub: https://github.com/jaegertracing/jaeger)

    Bindings/Operator: Python, Java, Node.js, Go, C++, Kubernetes, JavaScript, OpenShift, C#, Apache Spark

    Splunk

    598
    1K
    20
    Search, monitor, analyze and visualize machine data
    PROS OF SPLUNK
    • 3
      API for searching logs, running reports
    • 3
      Alert system based on custom query results
    • 2
      Dashboarding on any log contents
    • 2
      Custom log parsing as well as automatic parsing
    • 2
      Ability to style search results into reports
    • 2
      Query engine supports joining, aggregation, stats, etc
    • 2
      Splunk language supports string, date manip, math, etc
    • 2
      Rich GUI for searching live logs
    • 1
      Query any log as key-value pairs
    • 1
      Granular scheduling and time window support
    CONS OF SPLUNK
    • 1
      Splunk query language rich so lots to learn

    related Splunk posts

    Shared insights on Splunk and Django

    I am designing a Django application for my organization which will be used as an internal tool. The infra team said that I will not have SSH access to the production server and that I will have to log all my backend application messages to Splunk. I have no knowledge of Splunk, so these are the approaches I am considering:

    Approach 1: Create an hourly cron job that uploads the server log file to some Splunk storage for later analysis. Is this possible?

    Approach 2: Stream the logs to some Splunk endpoint. Is this possible? (If yes, I feel network usage and communication overhead will be a pain point for my application.)

    Is there any better or standard approach? Thanks in advance.
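One way Approach 2 is often handled is Splunk's HTTP Event Collector (HEC), which accepts JSON events over HTTPS with a token in the `Authorization` header. The sketch below is an illustration under assumptions, not a recommendation from the original poster: it assumes the infra team has HEC enabled, the host and token are hypothetical placeholders, and the transport is injected so the example runs without a Splunk server. Batching several records into one request is what keeps the network overhead modest.

```python
import json
import logging

# Assumption: Splunk's HTTP Event Collector (HEC) is enabled by the infra team.
# The host and token below are hypothetical placeholders.
HEC_URL = "https://splunk.example.com:8088/services/collector/event"
HEC_TOKEN = "00000000-0000-0000-0000-000000000000"


class HecHandler(logging.Handler):
    """Buffer log records and hand them to `send` in batches."""

    def __init__(self, send, batch_size=50, sourcetype="django"):
        super().__init__()
        self.send = send
        self.batch_size = batch_size
        self.sourcetype = sourcetype
        self.buffer = []

    def emit(self, record):
        self.buffer.append({
            "time": record.created,  # epoch seconds
            "sourcetype": self.sourcetype,
            "event": {"level": record.levelname, "logger": record.name,
                      "message": record.getMessage()},
        })
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        if not self.buffer:
            return
        # HEC accepts several JSON event objects concatenated in one body.
        body = "\n".join(json.dumps(e) for e in self.buffer)
        self.buffer = []
        self.send(HEC_URL, body, {"Authorization": f"Splunk {HEC_TOKEN}"})


sent = []  # stand-in transport; a real one would POST the body over HTTPS
handler = HecHandler(send=lambda url, body, headers: sent.append((url, body, headers)),
                     batch_size=2)
log = logging.getLogger("hec_demo")
log.setLevel(logging.INFO)
log.addHandler(handler)
log.propagate = False
log.info("user %s logged in", "alice")
log.info("report generated")
```

In a Django project, a handler like this would be registered through the `LOGGING` setting, and a production version would also need to flush on a timer and handle failed requests.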

    Shared insights on Kibana, Splunk, and Grafana

    I use Kibana because it ships with the ELK stack. I don't find it as powerful as Splunk; however, it is light-years ahead of grepping through log files. We previously used Grafana but found it annoying to maintain a separate tool outside of the ELK stack. We were able to get everything we needed from Kibana.

    ELK

    841
    925
    21
    The acronym for three open source projects: Elasticsearch, Logstash, and Kibana
    PROS OF ELK
    • 13
      Open source
    • 3
      Can run locally
    • 3
      Good for startups with monetary limitations
    • 1
      External Network Goes Down You Aren't Without Logging
    • 1
      Easy to setup
    • 0
      JSON log support
    • 0
      Live logging
    CONS OF ELK
    • 5
      Elastic Search is a resource hog
    • 3
      Logstash configuration is a pain
    • 1
      Bad for startups with personal limitations

    related ELK posts

    Wallace Alves
    Cyber Security Analyst · 2 upvotes · 859.7K views

    Docker, Docker Compose, Portainer, ELK, Elasticsearch, Kibana, Logstash, nginx

    JavaScript

    350.6K
    266.9K
    8.1K
    Lightweight, interpreted, object-oriented language with first-class functions
    PROS OF JAVASCRIPT
    • 1.7K
      Can be used on frontend/backend
    • 1.5K
      It's everywhere
    • 1.2K
      Lots of great frameworks
    • 896
      Fast
    • 745
      Light weight
    • 425
      Flexible
    • 392
      You can't get a device today that doesn't run js
    • 286
      Non-blocking i/o
    • 236
      Ubiquitousness
    • 191
      Expressive
    • 55
      Extended functionality to web pages
    • 49
      Relatively easy language
    • 46
      Executed on the client side
    • 30
      Relatively fast to the end user
    • 25
      Pure Javascript
    • 21
      Functional programming
    • 15
      Async
    • 13
      Full-stack
    • 12
      Setup is easy
    • 12
      Its everywhere
    • 11
      JavaScript is the New PHP
    • 11
      Because I love functions
    • 10
      Like it or not, JS is part of the web standard
    • 9
      Can be used in backend, frontend and DB
    • 9
      Expansive community
    • 9
      Future Language of The Web
    • 9
      Easy
    • 8
      No need to use PHP
    • 8
      For the good parts
    • 8
      Can be used both as frontend and backend as well
    • 8
      Everyone use it
    • 8
      Most Popular Language in the World
    • 8
      Easy to hire developers
    • 7
      Love-hate relationship
    • 7
      Powerful
    • 7
      Photoshop has 3 JS runtimes built in
    • 7
      Evolution of C
    • 7
      Popularized Class-Less Architecture & Lambdas
    • 7
      Agile, packages simple to use
    • 7
      Supports lambdas and closures
    • 6
      1.6K Can be used on frontend/backend
    • 6
      It's fun
    • 6
      Hard not to use
    • 6
      Nice
    • 6
      Client side JS uses the visitors CPU to save Server Res
    • 6
      Versatile
    • 6
      It lets me use Babel & TypeScript
    • 6
      Easy to make something
    • 6
      Its fun and fast
    • 6
      Can be used on frontend/backend/Mobile/create PRO Ui
    • 5
      Function expressions are useful for callbacks
    • 5
      What to add
    • 5
      Client processing
    • 5
      Everywhere
    • 5
      Scope manipulation
    • 5
      Stockholm Syndrome
    • 5
      Promise relationship
    • 5
      Clojurescript
    • 4
      Because it is so simple and lightweight
    • 4
      Only Programming language on browser
    • 1
      Hard to learn
    • 1
      Test
    • 1
      Test2
    • 1
      Easy to understand
    • 1
      Not the best
    • 1
      Easy to learn
    • 1
      Subskill #4
    • 0
      Hard 彤
    CONS OF JAVASCRIPT
    • 22
      A constant moving target, too much churn
    • 20
      Horribly inconsistent
    • 15
      Javascript is the New PHP
    • 9
      No ability to monitor memory utilization
    • 8
      Shows Zero output in case of ANY error
    • 7
      Thinks strange results are better than errors
    • 6
      Can be ugly
    • 3
      No GitHub
    • 2
      Slow

    related JavaScript posts

    Zach Holman

    Oof. I have truly hated JavaScript for a long time. Like, for over twenty years now. Like, since the Clinton administration. It's always been a nightmare to deal with all of the aspects of that silly language.

    But wowza, things have changed. Tooling is just way, way better. I'm primarily web-oriented, and using React and Apollo together the past few years really opened my eyes to building rich apps. And I deeply apologize for using the phrase rich apps; I don't think I've ever said such Enterprisey words before.

    But yeah, things are different now. I still love Rails, and still use it for a lot of apps I build. But it's that silly rich apps phrase that's the problem. Users have way more comprehensive expectations than they did even five years ago, and the JS community does a good job at building tools and tech that tackle the problems of making heavy, complicated UI and frontend work.

    Obviously there's a lot of things happening here, so just saying "JavaScript isn't terrible" might encompass a huge amount of libraries and frameworks. But if you're like me, yeah, give things another shot- I'm somehow not hating on JavaScript anymore and... gulp... I kinda love it.

    Git

    289.6K
    174K
    6.6K
    Fast, scalable, distributed revision control system
    PROS OF GIT
    • 1.4K
      Distributed version control system
    • 1.1K
      Efficient branching and merging
    • 959
      Fast
    • 845
      Open source
    • 726
      Better than svn
    • 368
      Great command-line application
    • 306
      Simple
    • 291
      Free
    • 232
      Easy to use
    • 222
      Does not require server
    • 27
      Distributed
    • 22
      Small & Fast
    • 18
      Feature based workflow
    • 15
      Staging Area
    • 13
      Most widespread VCS
    • 11
      Role-based codelines
    • 11
      Disposable Experimentation
    • 7
      Frictionless Context Switching
    • 6
      Data Assurance
    • 5
      Efficient
    • 4
      Just awesome
    • 3
      Github integration
    • 3
      Easy branching and merging
    • 2
      Compatible
    • 2
      Flexible
    • 2
      Possible to lose history and commits
    • 1
      Rebase supported natively; reflog; access to plumbing
    • 1
      Light
    • 1
      Team Integration
    • 1
      Fast, scalable, distributed revision control system
    • 1
      Easy
    • 1
      Flexible, easy, Safe, and fast
    • 1
      CLI is great, but the GUI tools are awesome
    • 1
      It's what you do
    • 0
      Phinx
    CONS OF GIT
    • 16
      Hard to learn
    • 11
      Inconsistent command line interface
    • 9
      Easy to lose uncommitted work
    • 7
      Worst documentation ever possibly made
    • 5
      Awful merge handling
    • 3
      Unexistent preventive security flows
    • 3
      Rebase hell
    • 2
      When --force is disabled, cannot rebase
    • 2
      Ironically even die-hard supporters screw up badly
    • 1
      Doesn't scale for big data

    related Git posts

    Simon Reymann
    Senior Fullstack Developer at QUANTUSflow Software GmbH · 30 upvotes · 9.2M views

    Our whole DevOps stack consists of the following tools:

    • GitHub (incl. GitHub Pages/Markdown for Documentation, GettingStarted and HowTo's) as collaborative review and code management tool
    • Git as revision control system
    • SourceTree as Git GUI
    • Visual Studio Code as IDE
    • CircleCI for continuous integration (automating the development process)
    • Prettier / TSLint / ESLint as code linter
    • SonarQube as quality gate
    • Docker as container management (incl. Docker Compose for multi-container application management)
    • VirtualBox for operating system simulation tests
    • Kubernetes as cluster management for docker containers
    • Heroku for deploying in test environments
    • nginx as web server (preferably used as facade server in production environment)
    • SSLMate (using OpenSSL) for certificate management
    • Amazon EC2 (incl. Amazon S3) for deploying in stage (production-like) and production environments
    • PostgreSQL as preferred database system
    • Redis as preferred in-memory database/store (great for caching)

    The main reason we have chosen Kubernetes over Docker Swarm is related to the following artifacts:

    • Key features: Easy and flexible installation, Clear dashboard, Great scaling operations, Monitoring is an integral part, Great load balancing concepts, Monitors the condition and ensures compensation in the event of failure.
    • Applications: An application can be deployed using a combination of pods, deployments, and services (or micro-services).
    • Functionality: Kubernetes has a complex installation and setup process, but it is not as limited as Docker Swarm.
    • Monitoring: It supports multiple versions of logging and monitoring when the services are deployed within the cluster (Elasticsearch/Kibana (ELK), Heapster/Grafana, Sysdig cloud integration).
    • Scalability: All-in-one framework for distributed systems.
    • Other Benefits: Kubernetes is backed by the Cloud Native Computing Foundation (CNCF), has a huge community among container orchestration tools, and is an open-source, modular tool that works with any OS.

    Tymoteusz Paul
    Devops guy at X20X Development LTD · 23 upvotes · 8.2M views

    Often enough I have to explain my way of going about setting up a CI/CD pipeline with multiple deployment platforms. Since I am a bit tired of yapping the same thing every single time, I've decided to write it up, share it with the world this way, and send people to read it instead ;). I will explain it using a "live example" of how Rome got built, assuming that the current methodology consists only of a readme.md and wishes of good luck (as it usually does ;)).

    It always starts with an app, whatever it may be, and reading the readmes available while Vagrant and VirtualBox are installing and updating. Following that is the first hurdle to get over: convert all the instructions/scripts into Ansible playbook(s), stopping only when a clean vagrant up or vagrant reload gives us a fully working environment. As our Vagrant environment is now functional, it's time to break it! This is the moment to look for how things can be done better (too rigid/too loose versioning? Sloppy environment setup?) and replace them with the right way to do stuff, one that won't bite us in the backside. This is the point, and the best opportunity, to upcycle the existing way of doing the dev environment to produce a proper, production-grade product.

    I should probably digress here for a moment and explain why. I firmly believe that the way you deploy production is the same way you should deploy development, short of a few debugging-friendly settings. This way you avoid discrepancies between how production works and how development works, which almost always cause major pains in the back of the neck, and with the use of proper tools this should mean no more work for the developers. That's why we start with Vagrant, as developer boxes should be as easy as vagrant up, but the meat of our product lies in Ansible, which will do the bulk of the work and can be applied to almost anything: AWS, bare metal, Docker, LXC, on the open net, behind a VPN - you name it.

    We must also give proper consideration to monitoring and logging hoovering at this point. My generic answer here is to grab Elasticsearch, Kibana, and Logstash. While for different use cases there may be better solutions, this one is well battle-tested, performs reasonably, and is very easy to scale both vertically (within some limits) and horizontally. Logstash rules are easy to write and are well supported in maintenance through Ansible, which, as I've mentioned earlier, is at the very core of things, and creating triggers/reports and alerts based on Elastic and Kibana is generally a breeze, including some quite complex aggregations.

    If we are happy with the state of the Ansible roles, it's time to move on and put all those roles and playbooks to work. Namely, we need something to manage our CI/CD pipelines. For me, the choice is obvious: TeamCity. It's modern, robust and, unlike most of the light-weight alternatives, it's transparent. What I mean by that is that it doesn't tell you how to do things, doesn't limit your ways to deploy, or test, or package for that matter. Instead, it provides a developer-friendly and rich playground for your pipelines. You can do mostly the same with Jenkins, but it has a quite dated look and feel to it, while also missing some key functionality that must be brought in via plugins (like a quality REST API, which comes built-in with TeamCity). It also comes with all the common, handy plugins like Slack or Apache Maven integration.

    The exact flow between CI and CD varies too greatly from one application to another to describe, so I will outline a few rules that guide me in it:

    1. Make build steps as small as possible. This way when something breaks, we know exactly where, without needing to dig and root around.
    2. All security credentials besides the development environment's must be sourced from individual Vault instances. Keys to those containers should exist only on the CI/CD box and be accessible by a few people (the fewer the better). This is pretty self-explanatory, as anything besides dev may contain sensitive data and, at times, be public-facing. Because of that, appropriate security must be present. TeamCity shines in this department with excellent secrets management.
    3. Every part of the build chain shall consume and produce artifacts. If it creates nothing, it likely shouldn't be its own build. This way, if any issue shows up with any environment or version, all a developer has to do is grab the appropriate artifacts to reproduce the issue locally.
    4. Deployment builds should be directly tied to specific Git branches/tags. This enables much easier tracking of what caused an issue, including automated identifying and tagging of the author (nothing like automated regression testing!).
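The rules about artifacts and about tying deployments to Git branches/tags can be sketched as a small lookup. This is an illustration only, not TeamCity configuration: the ref patterns, environment names, and artifact naming scheme are hypothetical conventions chosen for the example.

```python
import re
from typing import Optional

# Hypothetical branch and tag conventions; adjust to the repository's own.
RULES = [
    (re.compile(r"^refs/tags/v\d+\.\d+\.\d+$"), "production"),
    (re.compile(r"^refs/heads/release/.+$"), "staging"),
    (re.compile(r"^refs/heads/develop$"), "development"),
]


def environment_for(ref: str) -> Optional[str]:
    """Return the environment a Git ref deploys to, or None if it never deploys."""
    for pattern, environment in RULES:
        if pattern.match(ref):
            return environment
    return None


def artifact_name(app: str, ref: str, commit: str) -> str:
    """Name the build artifact so it can be traced back to a ref and commit."""
    short_ref = ref.rsplit("/", 1)[-1]
    return f"{app}-{short_ref}-{commit[:8]}.tar.gz"
```

With a mapping like this, every deployed artifact names the exact ref and commit it came from, so reproducing an issue locally starts from a known build rather than a guess.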

    Speaking of deployments, I generally try to keep it simple but also with a close eye on the wallet. Because of that, I am more than happy with AWS or another cloud provider, but I am also constantly peeking at the loads and asking whether we get the value of what we are paying for. Often enough the pattern of use is not constantly erratic, but rather has a firm baseline which could be migrated away from the cloud and onto bare metal boxes. That is another part where this approach strongly triumphs over the common Docker and CircleCI setup, where you are very much tied in to using cloud providers and getting out is expensive. Here, to embrace bare-metal hosting all you need is the help of some container-based self-hosting software; my personal preference is Proxmox and LXC. Following that, all you must write are Ansible scripts to manage the Proxmox hardware, in a similar way as you do for Amazon EC2 (Ansible supports both well), and you are good to go. One does not exclude the other, quite the opposite, as they can live in great synergy and cut your costs dramatically (the heavier your base load, the bigger the savings) while providing production-grade resiliency.
