Amazon EC2

CTO at La Cupula Music SL:

Our base infrastructure is composed of Debian-based servers running on Amazon EC2, asset storage in Amazon S3, and, for data storage, Amazon RDS for Aurora plus Redis under Amazon ElastiCache.

We are starting to work on automated provisioning and management with Terraform.

Tech Lead, Big Data Platform at Pinterest:

To meet employees' critical need for interactive querying, we've worked with Presto, an open-source distributed SQL query engine, over the years. Operating Presto at Pinterest's scale has involved resolving quite a few challenges, such as supporting deeply nested and huge Thrift schemas, slow/bad worker detection and remediation, cluster auto-scaling, graceful cluster shutdown, and impersonation support for the LDAP authenticator.

Our infrastructure is built on top of Amazon EC2 and we leverage Amazon S3 for storing our data. This separates compute and storage layers, and allows multiple compute clusters to share the S3 data.

We have hundreds of petabytes of data and tens of thousands of Apache Hive tables. Our Presto clusters comprise a fleet of 450 r4.8xl EC2 instances. Together, the clusters have over 100 TB of memory and 14K vCPU cores. Within Pinterest, more than 1,000 monthly active users (out of 1,600+ Pinterest employees in total) use Presto, running about 400K queries on these clusters per month.

Each query submitted to a Presto cluster is logged to a Kafka topic via Singer, a logging agent built at Pinterest that we covered in a previous post. Each query is logged both when it is submitted and when it finishes. When a Presto cluster crashes, we are left with query-submitted events that have no corresponding query-finished events, and these unmatched events let us measure the effect of cluster crashes over time.
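
As a rough illustration of that bookkeeping, here is a minimal Python sketch; the event names and fields are assumptions for illustration, not Pinterest's actual Singer/Kafka schema:

    # Pair query lifecycle events; submissions with no matching finish
    # fall inside a crash window. (Hypothetical event schema.)

    def find_unfinished_queries(events):
        """Return IDs of queries that were submitted but never finished."""
        submitted, finished = set(), set()
        for event in events:
            if event["type"] == "query_submitted":
                submitted.add(event["query_id"])
            elif event["type"] == "query_finished":
                finished.add(event["query_id"])
        return submitted - finished

    sample = [
        {"type": "query_submitted", "query_id": "q1"},
        {"type": "query_finished", "query_id": "q1"},
        {"type": "query_submitted", "query_id": "q2"},  # no finish: crash window
    ]
    print(find_unfinished_queries(sample))  # {'q2'}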

Each Presto cluster at Pinterest has workers on a mix of dedicated AWS EC2 instances and Kubernetes pods. The Kubernetes platform gives us the ability to add and remove workers from a Presto cluster very quickly: the best-case latency for bringing up a new worker on Kubernetes is less than a minute. However, when the Kubernetes cluster itself is out of resources and needs to scale up, it can take up to ten minutes. Another advantage of deploying on Kubernetes is that our Presto deployment becomes agnostic of cloud vendor, instance types, OS, etc.

#BigData #AWS #DataScience #DataEngineering

Source: Presto at Pinterest - Pinterest Engineering Blog - Medium (medium.com)

Kaibo Hao (January 28th 2020):

ECS on AWS can reduce your EC2 and Kubernetes costs. Athena may be another tool for cutting costs by replacing Presto: it uses S3 as the storage layer and provides serverless management of the infrastructure for you.
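
For example, a minimal boto3 sketch of running a serverless Athena query over data in S3; the database, table, and result bucket are placeholders:

    # Hypothetical example: query S3-backed data serverlessly with Athena.
    import boto3

    athena = boto3.client("athena")

    response = athena.start_query_execution(
        QueryString="SELECT user_id, COUNT(*) FROM events GROUP BY user_id",
        QueryExecutionContext={"Database": "analytics"},  # placeholder database
        ResultConfiguration={"OutputLocation": "s3://my-athena-results/"},
    )
    print(response["QueryExecutionId"])  # poll get_query_execution for status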

DevOps Engineer at PlayAsYouGo:

For our compute services, we decided to use AWS Lambda, as it is perfect for quick executions (ideal for a bot), is serverless, and is required by Amazon Lex, which we will use as the framework for our bot. We chose Amazon Lex as it integrates well with other #AWS services and uses the same technology as Alexa, which will give customers the ability to purchase licenses through their Alexa devices. We chose Amazon DynamoDB to store customer information, as it is a NoSQL database with high performance and high availability. If we decide to train our own models for license recommendation, we will use either Amazon SageMaker or Amazon EC2 with AWS Elastic Load Balancing (ELB) and AWS ASG, as they are ideal for model training and inference.
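
As an illustration only, a minimal boto3 sketch of storing and reading a customer record in DynamoDB; the table and attribute names are assumptions, not the actual schema:

    # Hypothetical table/attributes; DynamoDB is schemaless beyond the key.
    import boto3

    dynamodb = boto3.resource("dynamodb")
    table = dynamodb.Table("Customers")  # assumed table, partition key customer_id

    table.put_item(Item={
        "customer_id": "c-123",
        "email": "jane@example.com",
        "licenses": ["pro", "team"],
    })

    item = table.get_item(Key={"customer_id": "c-123"})["Item"]
    print(item["licenses"])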

Senior Fullstack Developer at QUANTUSflow Software GmbH:

Our whole DevOps stack consists of the following tools:

  • GitHub (incl. GitHub Pages/Markdown for documentation, GettingStarted and HowTo's) as collaborative review and code management tool
  • Git as the underlying revision control system
  • SourceTree as Git GUI
  • Visual Studio Code as IDE
  • CircleCI for continuous integration (automating the development process)
  • Prettier / TSLint / ESLint as code linter
  • SonarQube as quality gate
  • Docker as container management (incl. Docker Compose for multi-container application management)
  • VirtualBox for operating system simulation tests
  • Kubernetes as cluster management for Docker containers
  • Heroku for deploying in test environments
  • nginx as web server (preferably used as facade server in production environment)
  • SSLMate (using OpenSSL) for certificate management
  • Amazon EC2 (incl. Amazon S3) for deploying in stage (production-like) and production environments
  • PostgreSQL as preferred database system
  • Redis as preferred in-memory database/store (great for caching)

The main reasons we chose Kubernetes over Docker Swarm are the following:

  • Key features: easy and flexible installation, clear dashboard, great scaling operations, monitoring as an integral part, great load balancing concepts, and health monitoring with automatic compensation in the event of failure.
  • Applications: An application can be deployed using a combination of pods, deployments, and services (or micro-services).
  • Functionality: Kubernetes has a complex installation and setup process, but it is not as limited as Docker Swarm.
  • Monitoring: It supports multiple versions of logging and monitoring when the services are deployed within the cluster (Elasticsearch/Kibana (ELK), Heapster/Grafana, Sysdig cloud integration).
  • Scalability: All-in-one framework for distributed systems.
  • Other benefits: Kubernetes is backed by the Cloud Native Computing Foundation (CNCF), has a huge community among container orchestration tools, and is an open-source, modular tool that works with any OS.
Larry Gryziak (April 30th 2020):

So why is your deployment different for your (Heroku) test/dev and your stage/production?

Simon Reymann (May 1st 2020):

When it comes to testing our web app, we do not demand great computational resources and need a very simple, convenient, and fast PaaS solution for deploying the app to our testers. In production, though, the demand for computational resources can rise very quickly. With Amazon we are able to control that in a better way.

Security Software Engineer at Pinterest:

We would like to detect unusual config changes that can potentially cause a production outage, such as a new SecurityGroup allow/deny rule, an AuthZ policy change, a secret key/certificate rotation, or an IP subnet add/drop. The problem is that the sources of these activities all differ: AWS IAM, Amazon EC2, internal prod services, the Envoy sidecar, etc.

Which technology would be best suited to detect only the important events (not all activity) from these various sources, across all workloads running on AWS and also in Splunk Cloud?

Cloud Architect at AWS recommends AWS Config:

For continuous monitoring and detecting unusual configuration changes, I would suggest you look into AWS Config.

AWS Config enables you to assess, audit, and evaluate the configurations of your AWS resources. Config continuously monitors and records your AWS resource configurations and allows you to automate the evaluation of recorded configurations against desired configurations. Here is a list of supported AWS resource types and resource relationships: https://docs.aws.amazon.com/config/latest/developerguide/resource-config-reference.html
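
As a quick illustration, a boto3 sketch that pulls the recorded configuration history of a security group from AWS Config; the resource ID is a placeholder:

    # Fetch recent configuration snapshots AWS Config recorded for a
    # security group (hypothetical resource ID).
    import boto3

    config = boto3.client("config")

    history = config.get_resource_config_history(
        resourceType="AWS::EC2::SecurityGroup",
        resourceId="sg-0123456789abcdef0",  # placeholder
        limit=10,
    )
    for item in history["configurationItems"]:
        print(item["configurationItemCaptureTime"], item["configurationItemStatus"])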

Also, as of Nov 2019, AWS Config supports third-party resources. You can now publish the configuration of third-party resources, such as GitHub repositories, Microsoft Active Directory resources, or any on-premises server, into AWS Config using the new API. More detail here: https://docs.aws.amazon.com/config/latest/developerguide/customresources.html

If you have multiple AWS accounts in your organization and want to detect changes across them: https://docs.aws.amazon.com/config/latest/developerguide/aggregate-data.html

Lastly, if you already use Splunk Cloud in your enterprise and are looking for a consolidated view, AWS Config is supported by Splunk Cloud per their documentation too: https://aws.amazon.com/marketplace/pp/Splunk-Inc-Splunk-Cloud/B06XK299KV

Ethan Grubber (April 16th 2021):

The key difference between Splunk Enterprise and Splunk Cloud is that you have no control over the underlying infrastructure with Cloud. You can install and manage apps using the familiar GUI, but any changes to the platform (permission changes, program installation, etc.) are done by Splunk support via a ticket.

Loki Robles (August 26th 2021):

If you are using AWS Config rules, AWS Config continuously evaluates your AWS resource configurations against desired settings.

Casual Software Engineer at Skedulo recommends Terraform:

While it won't detect events as they happen, a good stopgap would be to define your infrastructure config using Terraform. You can then periodically run Terraform against your environment and alert if there are any changes, as in the sketch below.
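
A minimal sketch of that periodic check in Python, assuming the Terraform CLI is on the path; the notify hook is a placeholder for your alerting:

    # `terraform plan -detailed-exitcode` exits 0 when live state matches
    # the config, 2 when the environment has drifted, 1 on error.
    import subprocess

    def notify(message):
        # Placeholder: wire this to Slack, PagerDuty, email, etc.
        print(message)

    def check_drift():
        result = subprocess.run(
            ["terraform", "plan", "-detailed-exitcode", "-no-color"],
            capture_output=True, text=True,
        )
        if result.returncode == 2:
            notify(f"Infrastructure drift detected:\n{result.stdout}")
        elif result.returncode == 1:
            notify(f"terraform plan failed:\n{result.stderr}")

    check_drift()  # run from cron or a scheduled CI job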

Thiago Arrais (July 20th 2020):

I would like to hear about using Terraform in combination with Open Policy Agent and/or HashiCorp's Sentinel for that purpose. Does anyone have any experience with that?


We need to perform ETL from several databases into a data warehouse or data lake. We want to

  • keep raw and transformed data available to users to draft their own queries efficiently
  • give users custom permissions and SSO
  • move between open-source on-premises development and cloud-based production environments

We want to use only inexpensive Amazon EC2 instances, on medium-sized data sets (16 GB to 32 GB), feeding into Tableau Server or Power BI for reporting and data analysis purposes.

Recommends Airflow and AWS Lambda:

You could also use AWS Lambda with a CloudWatch Events schedule if you know when the function should be triggered. The benefit is that you can use any language and its respective database client.
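
For illustration, a minimal sketch of such a function in Python; the buckets, key, and transform are placeholders, and a CloudWatch Events/EventBridge schedule such as rate(1 hour) would be configured to invoke the handler:

    # Hypothetical scheduled ETL step: read a raw object, transform it,
    # write it to the curated area of the lake.
    import boto3

    s3 = boto3.client("s3")

    def handler(event, context):
        raw = s3.get_object(Bucket="raw-data", Key="latest.csv")["Body"].read()
        transformed = raw.decode("utf-8").upper()  # stand-in for a real transform
        s3.put_object(Bucket="curated-data", Key="latest.csv",
                      Body=transformed.encode("utf-8"))
        return {"status": "ok"}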

But if you orchestrate ETLs, then it makes sense to use Apache Airflow. This requires Python knowledge.

Recommends Airflow:

Though we have always built something custom, Apache Airflow (https://airflow.apache.org/) stood out as a key contender/alternative among open-source options. On the commercial side, Amazon Redshift combined with Amazon Kinesis (for complex manipulations) is great for BI, though Redshift as such is expensive.

Needs advice on Amazon S3 and HBase

Hi, I'm building a machine learning pipeline to store image bytes and image vectors in the backend.

So, when users query for random-access image data by key, we return the image bytes and perform machine learning model operations on them.

I'm currently considering going with Amazon S3 (in the future, maybe adding a Redis caching layer) as the backend system to store the information (S3 buckets with sharded prefixes). The latency of S3 is 100-200 ms (get/put), and it has a high throughput of 3,500 puts/sec and 5,500 gets/sec for a given bucket/prefix. If I need to reduce the latency in the future, I can add the Redis cache.
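
For illustration, a minimal boto3 sketch of the sharded-prefix scheme described above; the bucket name, shard count, and hashing choice are assumptions:

    # Hashing the image ID spreads keys across prefixes, so each prefix
    # gets its own 3,500 PUT / 5,500 GET per-second budget.
    import hashlib
    import boto3

    s3 = boto3.client("s3")
    BUCKET = "image-store"  # placeholder bucket

    def shard_key(image_id: str, shards: int = 64) -> str:
        shard = int(hashlib.md5(image_id.encode()).hexdigest(), 16) % shards
        return f"shard-{shard:02d}/{image_id}"

    def put_image(image_id: str, image_bytes: bytes) -> None:
        s3.put_object(Bucket=BUCKET, Key=shard_key(image_id), Body=image_bytes)

    def get_image(image_id: str) -> bytes:
        return s3.get_object(Bucket=BUCKET, Key=shard_key(image_id))["Body"].read()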

Also, S3 costs are far lower than HBase's (on Amazon EC2 instances with a 3x replication factor).

I have not personally used HBase before, so can someone help me decide if I'm making the right choice here? I'm not aware of HBase latencies, and I have learned that the MOB feature in HBase has to be turned on if we store image bytes in one of the column families, as the average image is about 240 KB.

IT and DevOps Manager at Tzunami Inc. needs advice on Ansible and AWS CodeDeploy

We have a .NET application hosted on Amazon EC2 instances: a web server instance and application instances. We use Visual Studio and TFS (Team Foundation Server) for development, build, and release. For deployment, we use a script we wrote, activated at the end of the release process. What is the best tool for automatic deployment of our application?

Software Engineer & Support Operations Lead recommends AWS CodeDeploy:

While I haven't used AWS CodeDeploy myself, I have a few years of experience with Ansible:

  • It doesn't play nice with Windows.
  • Dependency hell is real.
    • Playbooks and tasks are built off of other playbooks and tasks, which are in turn built off other playbooks and tasks. Actually figuring out what Ansible is really doing, to install MySQL for instance, requires digging several layers deep into other Ansible scripts.
  • Debugging is a chore.
    • When a dependency needs updating, or there's a newer pattern for installing/updating the OS and libs, it can be very difficult to find which parts of your Ansible script need updating, and where. Often the change is in one of the playbooks you've built from, and you're left waiting for its maintainer to update it, or taking on that whole playbook yourself.
Associate Java Developer at txtsol:

I am working on a full-stack application (Spring Boot/Java, Angular 7, MySQL) with Apache Maven as the build tool, and I need to deploy and host this web app on AWS. I searched and found out I should use a PaaS. There are two options: (1) AWS Elastic Beanstalk and (2) Amazon EC2. My question is: which services should I use to deploy and host my web app?

Principal Software Engineer at Accurate Background recommends Heroku:

Technically, these and many others would work. In fact, Elastic Beanstalk uses EC2: EC2 is just the service that provisions the machines where code can run, while Elastic Beanstalk is basically a layer on top of that which hides some of the EC2 complexities.

But complexity is a key thing to consider here. There is a lot of configuration that goes into setting up a deploy environment that is secure and stable. Unless you're an infrastructure expert, I would leave a direct EC2 setup alone.

If you, as a developer, have to set up a deployed app with no infrastructure team to support you, I would opt for something that abstracts away as much of the complexity as possible: either Elastic Beanstalk or something like Heroku. I personally use Heroku for my personal projects because of its ease of use.

DevOps | Senior Developer:

Here is my recommendation...and I do this sort of thing all the time.

Create a VPC with public and private subnets. Launch a t3.small instance with Amazon Linux and install Jenkins in the public subnet. Make sure all your Java dependencies are there, which they should be; if not, install them.

Create your Elastic Beanstalk application with Spring Boot, Java, and Maven, which should be Corretto 11 running on 64bit Amazon Linux 2/3.2.8 (as of today). You will need a file named Procfile in the root of your project to initiate your app's start-up. It should contain something like:

    web: java -Dserver.port=8084 -jar build/libs/myapp-*.jar

(The jar path is relative to the root of the project.)

In Jenkins you will make a project for building your Java application. In the project, you simply add the instructions in a shell script, exactly as you would do it from the Linux command line. You can also find Maven plugins; it's up to you, and you can figure out how best to do that.

Your EB app and environment should deploy the load balancer in the public subnet and your Java application in the private subnet. These are all part of the EB configuration. You will need to create a security group that allows port access from your load balancer to your application, as in the sketch below. Also, you should create a certificate in Certificate Manager for your domain, which should be set up in Route 53. In EB, you can then configure your load balancer to always use that cert.
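
For illustration, a boto3 sketch of that security-group rule; the group IDs and port are placeholders for your environment:

    # Allow the load balancer's security group to reach the application
    # instances on the app port.
    import boto3

    ec2 = boto3.client("ec2")

    ec2.authorize_security_group_ingress(
        GroupId="sg-app000000000000000",  # application instances (placeholder)
        IpPermissions=[{
            "IpProtocol": "tcp",
            "FromPort": 8084,
            "ToPort": 8084,
            "UserIdGroupPairs": [{"GroupId": "sg-lb0000000000000000"}],  # LB SG
        }],
    )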

Your Angular application should be built in its own Jenkins project, then deployed to S3 with CloudFront as the CDN in front of the bucket. After each deployment, sync to S3, deleting all previous contents of the bucket, and invalidate the cache of your CloudFront distribution. This ensures your application is fresh and has all your updates and changes after each deployment. Apply your DNS routing to your CloudFront distribution as well, via Route 53. There's documentation on doing all this, and a simplified sketch follows.
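
A simplified boto3 sketch of that step (upload the build, then invalidate the distribution); bucket, distribution ID, and paths are placeholders, and unlike `aws s3 sync --delete` it does not remove stale objects:

    # Push the frontend build to S3, then invalidate the CloudFront
    # cache so users get the new version.
    import mimetypes
    import pathlib
    import time
    import boto3

    s3 = boto3.client("s3")
    cloudfront = boto3.client("cloudfront")

    BUILD_DIR = pathlib.Path("dist/myapp")  # placeholder build output
    for path in BUILD_DIR.rglob("*"):
        if path.is_file():
            content_type = mimetypes.guess_type(path.name)[0] or "binary/octet-stream"
            s3.upload_file(str(path), "my-frontend-bucket",
                           str(path.relative_to(BUILD_DIR)),
                           ExtraArgs={"ContentType": content_type})

    cloudfront.create_invalidation(
        DistributionId="E1234567890ABC",  # placeholder distribution
        InvalidationBatch={
            "Paths": {"Quantity": 1, "Items": ["/*"]},
            "CallerReference": str(time.time()),  # must be unique per call
        },
    )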

To allow Jenkins to deploy to Elastic Beanstalk as well as S3 (and also perform CloudFront invalidations on publish), simply create a role in IAM that allows the permissions to the services you need. Once you have that role, apply it to the EC2 instance running Jenkins.

Finally, your MySQL database should be in RDS. If production, use Multi-AZ; otherwise just launch what you need. Your DB should also be launched in your private subnet. You will need to create another security group for the DB as well; it should allow access from your application security group on port 3306, or whatever port you run on.

In Jenkins you will need to install any plugins required for your Git repository (Bitbucket, GitHub, etc.). In your repository settings, enable a webhook to your Jenkins server. The URL should be something like https://build.mysite.com/bitbucket-hook/. Your projects should be separate for the Java app build and deploy, and similarly for the Angular app build and deploy. Each app should be in a separate repo with its own webhook. Separating your app from your DB and your frontend is best practice: it leaves room to scale each component independently and decouples everything (API-first concept). It also forces a best-practice security setup (Zero Trust concept).

So there are some specific suggestions. The nuts and bolts, though, are: MySQL in RDS, Java on Elastic Beanstalk, and your Angular application in S3 with a CloudFront distribution in front. Use Certificate Manager for your SSL and Route 53 for all your DNS. Figure all that out and you will have an industry-standard stack that is ready for performance and scale.

It's true what others have said: Elastic Beanstalk is simply EC2, an Application Load Balancer, security groups, and a few other AWS services. You will see all your instances, security groups, load balancers, etc. where you'd expect them to be. However, it makes it all turnkey: CloudWatch, redundancy, scaling, deployment strategy, and subnet placement. EB has some idiosyncrasies, but building what it does on your own is much more work. If you want to get deeper into customizing your instances and web servers, research .ebextensions and .platform, which you can drop into your project source to launch your stacks EXACTLY like you want them. Hopefully your setup is straightforward, though, and you won't need much of that.

Good luck!

Needs advice on Amazon EC2 and Amazon RDS

Hi, here's how the story goes.

We started transforming a monolith, single-machine, e-commerce application (Apache/PHP) to cloud infrastructure. Obviously, the application and the database (MySQL) were on the same machine.

We decided to move to AWS. As the first step of the transformation, we split the database from the application: the application is hosted on a c4.xlarge machine, and the database on RDS Aurora MySQL on a db.r5.large machine, with default options.

This setup performed well; in particular, database performance went up significantly.

Unfortunately, when traffic spiked, we started experiencing long response times. It looked like RDS, although really fast at executing queries, wasn't returning results to the Amazon EC2 machine fast enough over the network.

That was our conclusion after an in-depth analysis of the setup, including Apache/MySQL/PHP tuning parameters. The delayed response time was definitely due to the network latency between the EC2 and RDS/Aurora machines, both of which are in the same region.
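
For anyone reproducing this kind of analysis, a rough Python sketch for separating network round-trip from query execution time from the EC2 host; connection details are placeholders, and it assumes the PyMySQL package:

    import time
    import pymysql  # assumed client library

    conn = pymysql.connect(
        host="mydb.cluster-xyz.us-east-1.rds.amazonaws.com",  # placeholder
        user="app", password="...", database="shop")

    with conn.cursor() as cur:
        start = time.perf_counter()
        cur.execute("SELECT 1")  # trivial query: elapsed time is mostly network
        cur.fetchall()
        print((time.perf_counter() - start) * 1000, "ms round trip")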

Before adding additional resources (e.g., ElastiCache), we'd first like to look into any default configuration we can tune to solve this problem.

What do you think we missed there?

DevOps | Senior Developer:

If you are using the Aurora Serverless option without enough initial compute capacity, it could be constantly scaling up and down and causing you latency issues. I dealt with this for a client, and our solution was to move away from Serverless to an EC2-based implementation with fixed resources adequate to handle the needed load.

Founder at Circlical:

I've handled absolutely incredible burst traffic with RDS/EC2. I have two questions:

  1. Have you enabled the RDS slow and index-less query logs to spot problematic queries in your design? (A sketch for enabling them follows this list.)

  2.1 Are your RDS and EC2 instances in the same availability zone?

  2.2 In the same VPC?
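
A boto3 sketch of enabling those logs through a custom DB parameter group; the group name is a placeholder, and for Aurora the same parameters may live in the cluster parameter group instead:

    # Turn on the slow query log and index-less query logging for MySQL.
    import boto3

    rds = boto3.client("rds")

    rds.modify_db_parameter_group(
        DBParameterGroupName="my-mysql-params",  # placeholder group
        Parameters=[
            {"ParameterName": "slow_query_log", "ParameterValue": "1",
             "ApplyMethod": "immediate"},
            {"ParameterName": "long_query_time", "ParameterValue": "1",
             "ApplyMethod": "immediate"},
            {"ParameterName": "log_queries_not_using_indexes",
             "ParameterValue": "1", "ApplyMethod": "immediate"},
        ],
    )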

dleblanc-vidcruiter (January 19th 2022):

If you're convinced it's network-related, points 2.1 and 2.2 here are crucial: for optimal performance you MUST have your EC2 instance and your RDS instance in the same VPC, connecting over internal connections.

Asif Kolachi (January 21st 2022):

Hi dleblanc. Yes, they're in the same VPC, but the internal connection is something I don't know about. Do you mean AWS PrivateLink? The RDS security group is already configured to block outside traffic except from the EC2 instance. The RDS endpoint I am using is the one provided in the RDS console, the DNS-based hostname; that's why I think the connection isn't private and local. Can you suggest more to read about the internal connection?

Asif Kolachi (January 21st 2022):

Thanks, Alexandre. Yes to all these points. The difference in latency is more prominent with small queries run in large numbers. I know I can optimize with a better alternative query approach, but the question here is the network latency.
