Lessons Learned Migrating from GCP to AWS

The tools and analysis we used as we migrated to support our IoT platform on AWS.

Yitaek Hwang

As more enterprises embrace multi- and hybrid-cloud strategies, Leverege wanted to evaluate how we could support our IoT platform on AWS. On GCP, our architecture consists of microservices running on GKE, using Pub/Sub as the message queue, Cloud Functions for serverless workloads, and Firebase and BigQuery for our real-time and OLAP databases. Not all of these products had a one-to-one match in terms of feature parity, so we had to consider alternatives.

Projects vs. Accounts

GCP uses Projects under an Organization for logical separation, whereas AWS requires new Accounts that can be grouped under AWS Organizations. Once AWS Organizations was set up, the difference between the two paradigms was negligible, but switching between Projects on GCP was easier (at least on the web console) than switching between accounts on AWS.
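
To make the paradigm difference concrete, here is a minimal Python sketch of how each cloud's SDK addresses the boundary; the project and profile names below are hypothetical placeholders, not our actual setup.

```python
import boto3
from google.cloud import pubsub_v1

# GCP: one set of credentials can address any Project you have access to;
# the project is just a parameter on the client call.
publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path("my-iot-project", "device-events")

# AWS: each Account is a separate credential boundary, typically handled
# with named profiles (or assumed roles) rather than a request parameter.
staging = boto3.Session(profile_name="leverege-staging")
production = boto3.Session(profile_name="leverege-production")
sns_staging = staging.client("sns")
```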

The big difference within Projects/Accounts is that GCP, leveraging its global backbone, does not scope most services to a region, whereas AWS resources largely live within a single region. This is a significant boost for companies running multi-region architectures, but since we were mostly running single-region deployments, the difference was minimal.

GKE vs. EKS

GKE provides an excellent Kubernetes experience, taking care of master node management, networking plugins, seamless upgrades, logging/monitoring, and autoscaling. On EKS, nothing comes pre-configured besides the master node. The first challenge we ran into was how difficult it was to spin up a new cluster via the console UI. In the end, we settled on Terraform to create the EKS cluster, but the experience was not as easy as simply clicking "create cluster" on the GKE console.
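
For illustration, here is a rough boto3 sketch of the bare EKS create-cluster call, showing how many pieces (IAM role, VPC subnets, security groups) you have to supply yourself. All ARNs and IDs are hypothetical, and we ultimately drove this through Terraform rather than a script like this.

```python
import boto3

eks = boto3.client("eks", region_name="us-east-1")

# Unlike GKE's one-click flow, the IAM role and VPC wiring must already exist.
eks.create_cluster(
    name="iot-platform",
    version="1.14",
    roleArn="arn:aws:iam::123456789012:role/eks-cluster-role",
    resourcesVpcConfig={
        "subnetIds": ["subnet-aaaa1111", "subnet-bbbb2222"],
        "securityGroupIds": ["sg-cccc3333"],
    },
)

# The control plane takes several minutes to come up; worker nodes, the CNI
# plugin, and any dashboards/monitoring are all separate steps after this.
waiter = eks.get_waiter("cluster_active")
waiter.wait(name="iot-platform")
```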

The next thing we noticed was the lack of built-in Kubernetes tooling. While GKE provides a web management UI for GKE workloads that is integrated with Stackdriver for monitoring, EKS expects users to install the metrics server and the Kubernetes Dashboard themselves. Sharing access to the Kubernetes Dashboard also requires an extra step: either an identity-aware proxy or a VPN to protect against unauthorized access. Running StatefulSets on EKS was also tricky, as getting the Cluster Autoscaler to respect EBS volume location meant deploying a Cluster Autoscaler per node group, split by workload type (see the sketch below). Finally, Kubernetes upgrades required manual patching of components on EKS, whereas GKE automatically updated components on a schedule within a maintenance window.
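
As a rough illustration of the EBS constraint, the sketch below flags node groups whose subnets span more than one availability zone; the cluster and region names are hypothetical, and it assumes EKS managed node groups.

```python
import boto3

eks = boto3.client("eks", region_name="us-east-1")
ec2 = boto3.client("ec2", region_name="us-east-1")

cluster = "iot-platform"
for ng_name in eks.list_nodegroups(clusterName=cluster)["nodegroups"]:
    ng = eks.describe_nodegroup(clusterName=cluster, nodegroupName=ng_name)["nodegroup"]
    subnets = ec2.describe_subnets(SubnetIds=ng["subnets"])["Subnets"]
    zones = {s["AvailabilityZone"] for s in subnets}
    if len(zones) > 1:
        # Multi-AZ is fine for stateless workloads, but a StatefulSet pod can
        # only reschedule into the zone where its EBS volume already lives.
        print(f"{ng_name}: spans {sorted(zones)} - unsafe for EBS-backed StatefulSets")
```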

Firebase vs. DynamoDB

At Leverege, we use Firebase for two purposes: (1) as a NoSQL database to store the last known state of IoT devices, and (2) as a push mechanism to update all Firebase clients (e.g., iOS/Android apps, web applications). On AWS, there was no single product to replace both features. One option was to use DynamoDB Streams and write custom listeners for all clients. The other recommended approach was to push data to AWS AppSync and use its subscription features. Neither option was ideal given the added complexity, so we decided to keep Firebase and treat it as a SaaS solution for a key-value store.
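
For a sense of what the custom-listener path would have involved, here is a hedged boto3 sketch of polling a DynamoDB Stream for item changes; the stream ARN is hypothetical, and a production consumer would also need to walk shard lineage and checkpoint its progress (or simply hand the stream to a Lambda trigger).

```python
import time
import boto3

streams = boto3.client("dynamodbstreams", region_name="us-east-1")
stream_arn = "arn:aws:dynamodb:us-east-1:123456789012:table/devices/stream/2020-01-01T00:00:00.000"

# Naively read from the first shard only; real streams have many shards.
shard = streams.describe_stream(StreamArn=stream_arn)["StreamDescription"]["Shards"][0]
iterator = streams.get_shard_iterator(
    StreamArn=stream_arn,
    ShardId=shard["ShardId"],
    ShardIteratorType="LATEST",
)["ShardIterator"]

while iterator:
    resp = streams.get_records(ShardIterator=iterator)
    for record in resp["Records"]:
        # With Firebase, this fan-out to connected clients comes for free;
        # here we would have to push each change to clients ourselves.
        print(record["eventName"], record["dynamodb"].get("NewImage"))
    iterator = resp.get("NextShardIterator")
    time.sleep(1)
```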

BigQuery vs. Redshift

The huge draw of BigQuery on our end was the decoupling of storage and analysis. We could stream all IoT data into BigQuery and be charged for analysis based only on how much data our queries scanned. Since our read-to-write ratio was heavily geared towards writes, BigQuery's pricing model was optimized for our usage. On the other hand, Redshift bundles storage and analysis, so the cost of pre-provisioning nodes to run Redshift was prohibitive until we scaled up our read workloads. The compromise here was to use Snowflake and run it on AWS. This way, we could still decouple storage and analysis, yet have the underlying system run on AWS infrastructure.
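
As a small illustration of that write-heavy path, here is a sketch of streaming rows into BigQuery with the google-cloud-bigquery client; the project, dataset, table, and field names are hypothetical.

```python
from google.cloud import bigquery

client = bigquery.Client(project="my-iot-project")
table_id = "my-iot-project.telemetry.device_readings"

# Streaming inserts are billed per row ingested; analysis is billed on the
# bytes each query scans, so write-heavy workloads stay cheap until queried.
rows = [
    {"device_id": "sensor-42", "ts": "2020-01-01T00:00:00Z", "temp_c": 21.5},
]
errors = client.insert_rows_json(table_id, rows)
if errors:
    raise RuntimeError(f"BigQuery insert failed: {errors}")
```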

Pub/Sub vs. Kinesis/SNS/SQS

Cloud Pub/Sub is the only managed messaging service GCP provides. On AWS, there are Kinesis, SNS, and SQS, which serve different needs. Since most of our payloads are small and require multiple consumers (e.g., data being written to both a raw data store and a historical data store), Pub/Sub's many-publisher/many-subscriber model worked great with our message flow. On AWS, Kinesis was too expensive for small payloads, and SNS or SQS alone did not support the fan-out architecture we needed. The solution was to have multiple SQS queues subscribe to the same SNS topic to replicate the Pub/Sub behavior, as sketched below.
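
Here is a minimal boto3 sketch of that fan-out, with hypothetical topic and queue names; the SQS access policy that authorizes SNS to deliver to each queue is omitted for brevity.

```python
import boto3

sns = boto3.client("sns", region_name="us-east-1")
sqs = boto3.client("sqs", region_name="us-east-1")

# One topic plays the role of a Pub/Sub topic.
topic_arn = sns.create_topic(Name="device-telemetry")["TopicArn"]

for queue_name in ("raw-data-writer", "historical-archiver"):
    queue_url = sqs.create_queue(QueueName=queue_name)["QueueUrl"]
    queue_arn = sqs.get_queue_attributes(
        QueueUrl=queue_url, AttributeNames=["QueueArn"]
    )["Attributes"]["QueueArn"]
    # Each subscribed queue receives its own copy of every message, like a
    # Pub/Sub subscription; RawMessageDelivery strips the SNS envelope.
    sns.subscribe(
        TopicArn=topic_arn,
        Protocol="sqs",
        Endpoint=queue_arn,
        Attributes={"RawMessageDelivery": "true"},
    )
```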

Summary

In terms of containers and Kubernetes, Google holds the definitive edge. After all, Google invented Kubernetes and has as much experience running containers at scale as anyone. However, AWS provides a variety of options that users can pick and choose from. Besides Firebase, we were able to find equivalent or alternative solutions to migrate our platform from GCP to AWS.

Yitaek Hwang

VP, Product Innovation

From traveling the world solving vision issues in underserved regions through ViFlex to building software to diagnose autism using machine learning, I realized that I like building things. So currently I’m on a path to build an Internet of Things (IoT) platform at Leverege as a Venture for America Fellow.
