If you’re working with AWS infrastructure, you may know that several tools and frameworks can be used to implement it, such as CloudFormation, Terraform, and AWS CDK. Which tool can your team rely on? Which one will help you increase productivity and quality? If you don't have an answer yet, read this post: I will share our answer and the reasons behind it.
Recently, our team worked on a project that crawls data from various sites for price comparison. We selected Puppeteer to implement it. We have successfully built and delivered this solution to our customer, so I am writing this article to share our experience and outline the steps to help you set it up and try it yourself. I hope you enjoy it!
Recently, we worked on a project that uses Neo4j to store and process large graph data for our client. The client asked for a solution to launch, install, and configure a single-node Neo4j instance (for the development environment) and a High Availability Neo4j cluster (for the production environment). Our team selected Ansible to implement this requirement. If you want to know why we selected Ansible, check out this article for more details.
The key principle across all our projects is to automate everything: it reduces errors (e.g. human mistakes), makes deployments and rollbacks fast and easy, and improves customer satisfaction.
In this article, we will explore how to extract data from an AWS RDS database and publish it to S3 with AWS Glue. We will cover the following requirements:
- Support for data stored in Amazon Aurora and all other Amazon RDS engines, Amazon Redshift, and Amazon S3 (in formats such as XML and CSV)
- Large data volumes
- Different data schemas
- A solution to easily switch environments from Development → Test → User Acceptance Testing (UAT) → Staging → Production
- Hardware that autoscales with data size
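
To make the environment-switching requirement more concrete, here is a minimal sketch of how per-environment settings for a Glue job could be derived from a single environment name. It is only an illustration under stated assumptions: the short environment keys, the `example` project prefix, the naming patterns for the job, connection, and bucket, and the argument names `--ENV`, `--CONNECTION_NAME`, and `--TARGET_PATH` are all hypothetical and not taken from our actual setup.

```python
from dataclasses import dataclass

# Environment order follows the list above; the short keys are hypothetical.
ENVIRONMENTS = ("dev", "test", "uat", "staging", "prod")


@dataclass(frozen=True)
class GlueJobSettings:
    """Settings that differ between environments (all names are illustrative)."""

    env: str
    job_name: str
    connection_name: str
    target_path: str

    def job_arguments(self) -> dict:
        # Glue job parameters are passed as "--key": "value" pairs.
        return {
            "--ENV": self.env,
            "--CONNECTION_NAME": self.connection_name,
            "--TARGET_PATH": self.target_path,
        }


def settings_for(env: str, project: str = "example") -> GlueJobSettings:
    """Build the settings for one environment from a shared naming pattern."""
    if env not in ENVIRONMENTS:
        raise ValueError(f"unknown environment: {env!r}")
    return GlueJobSettings(
        env=env,
        job_name=f"{project}-{env}-rds-to-s3",
        connection_name=f"{project}-{env}-rds",
        target_path=f"s3://{project}-{env}-data-lake/exports/",
    )


if __name__ == "__main__":
    # Print the arguments each environment would receive.
    for env in ENVIRONMENTS:
        print(env, settings_for(env).job_arguments())
```

Keeping the environment name as the only input means the same job script can be promoted from Development to Production without code changes; only the arguments differ.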