Where does your data warehousing belong? This question was the title of a Wired article, published in 2013, asking if we will ever overcome the challenges of storing data anywhere but on-premises. The report took an optimistic tone, quoting analyst predictions that, by now, the cloud world would be the leading place to store all types of data.
Seven years later, this may be true for data warehouses (congratulations, Snowflake), but for database-backed applications that still isn’t the case. By January 2020, according to Forrester, only 20 percent of data-intensive applications had migrated to the cloud. In hindsight, this isn’t surprising. Data has always been important to companies, and after several years of the “data is the new oil/gold” narrative, can we blame data stewards for being very cautious?
Not only that, but the same problems that concerned enterprises back in 2013 still prevail, specifically control. If you put your data in a cloud, do you still have the same amount of control, including over where and how it is stored? The answer is: you need to know what to ask for, and understand what degree of control is offered to you by your cloud service providers (CSPs) and any SaaS provider that handles data that you own. Many SaaS offerings don’t give customers the necessary levels of transparency and data management.
There are exceptions, particularly among VPC (Virtual Private Cloud) solutions. Here we’ve seen the CSPs lead the trend where companies can deploy data and containerized applications in VPCs that they control, deployed in geographically distributed regions that the data owner selects. Essentially, the hyperscaler provides the VPC, and the customer retains control similar to what they’d have with on-premises systems.
Yet there is a catch. This model works, provided you accept that you are pretty much married to the service provider’s infrastructure, regions, and database deployment services. That brings the risk of lock-in and higher costs. So, there is less choice and more expense involved than you might see at face value.
The first point about infrastructure is significant, especially if you are concerned about data residency (where the data physically resides), data sovereignty (respecting the data laws where the company holds citizenship), and meeting data regulations such as GDPR or CCPA. But regarding security and ownership, if your VPC is beholden to a specific cloud provider, and you are using one of their data storage services, how can you be sure that you own your data? The same holds true for SaaS providers who offer you infrastructure plus data services. You must trust them 100%.
This arrangement results in you ceding considerable control over your data. As described in our whitepaper, The TCO Advantages Of Couchbase Cloud’s In-VPC Deployment, it is in your best interest to separate your infrastructure vendor from your database vendor. This way, you can change either one if you are dissatisfied without ceding control of your data. When one vendor supplies both the database and the infrastructure, you may lose the ability to do things such as lock that vendor out of your data. The question moves beyond whether you know where your data physically resides. It goes much deeper: do you truly have the control you need to manage your data on your company’s terms and within its risk appetite? Who has the keys?
Fortunately, the answer does not end with what hyperscaler VPC solutions offer. In-VPC is an emerging best practice that decouples the cloud service provider’s infrastructure from the data storage and applications you need to run. Pioneering software vendors are using this model to abstract the control elements away from the infrastructure layer. This change improves your visibility and control over data (including the ability to shut out anyone, even the database provider), enables you to blend different services and cloud providers (think true multi-cloud), and delivers better management of costs.
Achieving this separation of powers requires that we first define three levels of priority: the first is data ownership and access; the second is infrastructure choice, configuration, and management; and the third is cost containment. The third touches on another cloud data bugbear: once a provider has your data, do you have a safety net to change providers or negotiate better terms? An In-VPC strategy addresses all of these challenges.
In-VPC is a relatively new development. It leverages new but proven technologies like containerization and NoSQL databases to finally answer three questions that have been bugging the market since at least 2013: Do you know where your data physically resides? Do you have control over it? Can you revoke access from everyone else? With In-VPC, the answers are yes, yes, and yes.