Building big iron in the cloud with Google Compute Engine
Iron Ore
Cloud computing is a fundamental evolutionary step in the world of computing. Using the cloud lowers initial investment, reduces costs, and improves ROI, while letting you scale capacity and performance elastically as demand changes. It's the best of both worlds for technical people and bean counters alike. Google Compute Engine is a stellar IaaS (Infrastructure as a Service) example that is part of a larger suite of options, including the App Engine PaaS (Platform as a Service), Cloud SQL, Cloud Storage, Cloud Datastore, and many other services.
The Google cloud allows you to build anything you might need for a modern IT infrastructure via Google's Cloud Platform [1]. In this article, I focus on the initial setup and configuration of the Google Compute Engine IaaS solution. Before I dive in, though, take a look at the many Google cloud services (Table 1), which should cover just about anything you would want to build.
Table 1: Google Cloud Services
| Name | Service |
|---|---|
| Google Compute Engine | Allows you to build hosted virtual machines in Google's cloud. As an IaaS option, it lets you right-size your resources and reduce your costs at the same time. |
| Google App Engine | Allows you to focus on coding your application and not on infrastructure configuration or administration. As a PaaS solution, it supports the Python, Java, PHP, and Go languages. |
| Google Cloud SQL | Fully managed MySQL at your fingertips, with no headaches of administration, scaling, or replication. |
| Cloud Storage | Highly available storage service for your data. It offers standard storage or durable reduced availability (think backup), OAuth, and granular access control. |
| Cloud Datastore | Offers a managed NoSQL database for storing non-relational data. This is a robust, powerful solution without the headaches. |
| BigQuery | Lets you analyze big data in the cloud with massive scalability and the fast data queries required of big data problems. |
| Cloud DNS | Google's high-performance domain name system (DNS) in the cloud. It is manageable via the command line or scriptable with Python. |
| Cloud Endpoints | Allows you to create RESTful services available to iOS, Android, and JavaScript clients. It features DDoS protection, OAuth support, and client key management. |
| Translate API | Helps you translate your application to another language programmatically. It supports most popular languages, but sadly lacks Klingon support. Lu'? |
| Prediction API | Allows use of Google's machine learning algorithms to analyze data. It is a powerful way to gain insights into future trends using historical data. |
| Cloud Deployment Manager | Provides a way to design, create, and deploy system templates. It also lets you actively monitor the status of your Google Cloud post-deployment. |
| Cloud SDK | A powerful group of tools and libraries for orchestrating your Google Cloud deployments, with the ability to control App Engine, Compute Engine, Cloud Storage, BigQuery, and Cloud SQL. |
| Push-to-Deploy | Lets you use Git to deploy your application automatically to App Engine. It works for applications written in Python, PHP, and Java. |
| Cloud Playground | Lets you try out App Engine, Cloud Storage, or Cloud SQL right in your browser. It also supports importing projects directly from GitHub. |
| Android Studio | Allows you to develop, debug, and put your code to work in Google's Cloud Platform from this new Android development environment. |
| Google Plugin for Eclipse | Software development tools for Java developers to aid in the design, build, and deployment of cloud-based App Engine applications. |
Google Compute Engine
Google Compute Engine was opened to the public in June 2012, a bit later than most other players in the cloud marketplace. Arrival time aside, it is a powerful, scalable, and performant IaaS solution.
Compute Engine lets you quickly and easily create anything from a simple single-node VM to a large-scale compute cluster on Google's world-class infrastructure. As of this writing, it supports several stellar open source operating systems (and one closed-source option), including Debian and CentOS; CoreOS, FreeBSD, and SELinux [2]; and Red Hat Enterprise Linux, SUSE, and Windows [3].
Instances are available with many options and are completely customizable from a hardware perspective. You can choose the number of cores, RAM, and other machine properties, and you can scale them as you grow [4]. Virtual instances start at a micro instance (f1-micro), with one core and 0.60GB of memory, and go up to 16 cores and 104GB of RAM. For the sake of the demo here, I will be using a shared-core small instance (g1-small; Table 2). Competition from Amazon, Microsoft, Rackspace, and others in the cloud marketplace has put increasing downward pressure on the price of many cloud offerings.
Table 2: Small Instances
| Instance Type | Virtual Cores | Memory | Price (US$/hour), US Hosted | Price (US$/hour), Europe Hosted | Price (US$/hour), Asia Pacific Hosted |
|---|---|---|---|---|---|
| f1-micro | 1 | 0.60GB | $0.013 | $0.014 | $0.014 |
| g1-small | 1 | 1.70GB | $0.035 | $0.0385 | $0.0385 |
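To put the hourly prices in Table 2 into perspective, the following sketch estimates a monthly bill for an instance that runs around the clock. It assumes a 30-day month (720 hours) and uses only the prices listed in the table; the function name `monthly_cost` is made up for this illustration, and a real bill also depends on current pricing and on other charges, such as storage and network traffic.

```shell
#!/bin/sh
# Rough monthly cost estimate from an hourly price (Table 2).
# Assumption: the instance runs 24 hours a day for a 30-day month.

HOURS_PER_MONTH=720   # 24 hours x 30 days

# monthly_cost HOURLY_PRICE -> prints the price for one month, two decimals
monthly_cost() {
    awk -v price="$1" -v hours="$HOURS_PER_MONTH" \
        'BEGIN { printf "%.2f\n", price * hours }'
}

echo "f1-micro (US):     \$$(monthly_cost 0.013)"
echo "g1-small (US):     \$$(monthly_cost 0.035)"
echo "g1-small (Europe): \$$(monthly_cost 0.0385)"
```

Under these assumptions, a g1-small instance hosted in the US comes to about $25.20 per month, and the same instance hosted in Europe to about $27.72.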
Constructing your own server in Google Compute Engine (GCE) is easy. Your first requirement is to have a Google account. If you have any experience with any other cloud platform (e.g., Amazon AWS and its AWS management console), you will feel right at home in the Google Developer Console [5].
Like other cloud services, GCE offers both a UI (user interface) and API (application programming interface). In this article, I focus on the basics of using the UI and related utilities to get you up and running quickly.
Setting Up Your Project
To get started with GCE, you need a project. Choose a project name and project ID, then click Create Compute Engine Instance, as shown in Figure 1. As with other cloud computing services, you have a dizzying array of options from which to choose.
On the left side of the window, select Compute Engine. Before you can run an instance, you must set up billing. Simply click Setup Billing, fill in the required information, and submit it.
Creating a New Instance
Once you have created a project and entered your billing information, you are finally ready to add a new Google Compute Engine instance. As shown in Figure 2, you need to fill in the name of the server and any other desired configuration. The elements you are most likely to customize are:
- Zone: Specifies the geographic region in which your virtual instance and its data will be located. Generally, choose the one closest to the clients you will serve.
- Machine type: Lets you choose system specifications in terms of processors and RAM.
- Image: Lets you choose an operating system from the many supported options. Currently supported OSs are Debian, CentOS, Red Hat Enterprise Linux, SUSE, and Windows [3].
- Network: Specifies the network that traffic can access. In this example, default is correct.
- External IP: Specifies the external address allotted when the instance is created. Here, the default Ephemeral address is bound to the instance as long as it exists.
Now that you have created a project and set up a GCE instance through the web interface, I'll explore setting up, managing, and controlling the project through Google's suite of command-line tools via the Cloud SDK.
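As a preview, the fields of the web form map fairly directly onto a single command line. The sketch below only assembles and prints such a command; it does not contact Google. The flag names (`--zone`, `--machine_type`, `--image`, `--network`) are recalled from the Gcutil of that era and should be treated as assumptions to check against the tool's built-in help, and the image name `debian-7` and the function name `build_addinstance` are hypothetical.

```shell
#!/bin/sh
# Sketch: assemble a gcutil command that mirrors the web form fields.
# Assumption: the flag names below are recalled from the deprecated gcutil
# tool; verify them with the tool's help before use.
# The script only prints the command; it does not create anything.

NAME="gcerocks-instance-1"     # instance name used in this article
ZONE="us-central1-b"           # zone that appears in Listing 1
MACHINE_TYPE="g1-small"        # shared-core type from Table 2
IMAGE="debian-7"               # hypothetical image name for Debian
NETWORK="default"              # default network, as in the web form

# build_addinstance prints the command line for the settings above
build_addinstance() {
    printf 'gcutil addinstance %s --zone=%s --machine_type=%s --image=%s --network=%s\n' \
        "$NAME" "$ZONE" "$MACHINE_TYPE" "$IMAGE" "$NETWORK"
}

build_addinstance
```

Printing the command first makes it easy to review the zone and machine type, both of which affect cost, before anything is created.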
Cloud SDK
The Google Cloud SDK is a set of tools and libraries to create and manage your Google Cloud. It supports App Engine, Compute Engine, Cloud Storage, BigQuery, Cloud SQL, and Cloud DNS. Before going further, you must meet the following Cloud SDK requirements:
- Python 2.7.x
- Java 1.7+ (for App Engine)
- A supported OS: Windows (requires Cygwin [6]), Mac OS X, Linux
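The requirements above can be checked before installing anything. The following is a minimal sketch, assuming a Linux or Mac OS X shell in which the interpreter is installed under the name `python`; the helper names `is_python27` and `have_cmd` are made up for this example.

```shell
#!/bin/sh
# Sketch: pre-flight check for the Cloud SDK requirements listed above.

# is_python27 VERSION_STRING -> succeeds if the string reports Python 2.7.x
is_python27() {
    case "$1" in
        "Python 2.7."*) return 0 ;;
        *)              return 1 ;;
    esac
}

# have_cmd NAME -> succeeds if NAME is found in PATH
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd python; then
    # Python 2 prints its version on stderr, so merge the streams
    version=$(python --version 2>&1)
    if is_python27 "$version"; then
        echo "OK: $version"
    else
        echo "WARNING: found $version, but the SDK expects Python 2.7.x"
    fi
else
    echo "WARNING: python not found in PATH"
fi

# Java is only needed for App Engine work; curl fetches the installer
for tool in java curl; do
    if have_cmd "$tool"; then
        echo "OK: $tool found"
    else
        echo "NOTE: $tool not found"
    fi
done
```

The script only reports what it finds and never aborts, so it is safe to run on any machine before deciding whether to proceed with the installation.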
To set up Gcutil [7], you must download and install the Google Cloud SDK. On the Linux distro of your choice, enter the commands
$ curl https://dl.google.com/dl/cloudsdk/release/install_google_cloud_sdk.bash | bash
$ unzip google-cloud-sdk.zip
$ ./google-cloud-sdk/install.sh
$ gcloud auth login
to transfer the SDK to your machine, unzip the file, run the installation script, and authenticate to the Google Cloud.
Authentication with OAuth2
Google Compute Engine uses the OAuth2 standard for authentication and authorization to access the Google Cloud. OAuth allows users to share data with your website or application while keeping their username and password – and other sensitive information – private.
With the Cloud SDK installed and authenticated, you can now SSH into your new instance. As you see (Listing 1), Google Cloud SDK sets up key-based authentication and takes you right into the instance specified in the gcutil command: gcerocks-instance-1.
Listing 1: SSH into an Instance
$ gcutil ssh gcerocks-instance-1

joe@m0nk3y:~/google-cloud-sdk$ gcutil ssh gcerocks-instance-1
INFO: Zone for gcerocks-instance-1 detected as us-central1-b.
WARNING: You don't have an ssh key for Google Compute Engine. Creating one now...
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
INFO: Updated project with new ssh key. It can take some time for the instance to pick up the key.
INFO: Waiting 10 seconds before attempting to connect.
INFO: Running command line: ssh -o UserKnownHostsFile=/dev/null -o CheckHostIP=no -o StrictHostKeyChecking=no -i /home/joe/.ssh/google_compute_engine -A -p 22 joe@1.2.3.4 --
Warning: Permanently added '1.2.3.4' (ECDSA) to the list of known hosts.
Enter passphrase for key '/home/joe/.ssh/google_compute_engine':
Linux gcerocks-instance-1 3.2.0-4-amd64 #1 SMP Debian 3.2.54-2 x86_64
Note that it is always good practice to put in a strong passphrase when asked to do so. Never leave it blank. Also mind the security of the local machine you use to manage your Google Cloud.
With the Cloud SDK set up, you have a range of utilities to manage your cloud (Table 3). If you use Gcutil standalone, it automates the setup of key-based authentication for SSH access to your instance. Gcutil creates a public/private key pair and uploads your public key to the cloud. Finally, it associates the key with your Google account, giving you access to any instance you create. As always, setting up Gcutil with key-based authentication is helpful but means little if you fail to add a strong passphrase to protect your key and lock down your local machine.
Table 3: Google Cloud Utilities
| Utility | Function |
|---|---|
| | Deploy and manage Google App Engine. |
| | Manage cloud resources (e.g., authentication, configuration) and workflow [8]. |
| | Manage Google Cloud SQL. |
| | Manage Google Compute Engine. Just as from the web console, you can manage from the CLI [9]. A few examples of how to use this tool are: |
| | Show the current version of Gcutil. |
| | Add a Google Compute Engine instance of the specified name. |
| | Remove a GCE instance. |
| | List current GCE instances. |
| | List all available commands. |
| | Manage Google Cloud Storage. |
| Gcutil standalone (deprecated) | Gcutil is the central tool used to manage your Google Compute Engine, but it was once distributed as a standalone tool. As of late, Google is encouraging the use of the Google Cloud SDK over the previous standalone Gcutil utility to consolidate development tools under one suite of tools [10]. |
Firewall in the Cloud
Next, you need to set up your cloud instance by configuring a firewall and adding persistent storage. All new instances by default block all external traffic, which is a smart security move from Google; default deny is always a good idea. To make the services you install available, you need to open up the firewall rules to that newly created instance.
To create a new firewall rule, click Networks, choose the default network (created with this instance), and go to Firewall | Create a new Firewall. Where you see default rules, click Create new. For example, Figure 3 shows an Nginx web server with HTTP on port 80 and HTTPS (SSL/TLS) on 443.
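The same rule can be expressed on the command line. The sketch below builds the list of allowed ports for a web server and prints a Gcutil command without running it. The `addfirewall` subcommand and its `--network` and `--allowed` flags are recalled from the now-deprecated Gcutil and should be treated as assumptions; the rule name `allow-web` and the helper `allowed_spec` are made up for this example.

```shell
#!/bin/sh
# Sketch: build a firewall rule for a web server on ports 80 and 443.
# Assumption: `gcutil addfirewall` with --network and --allowed is recalled
# from the deprecated gcutil tool; the rule name "allow-web" is hypothetical.
# The script only prints the command and changes nothing.

# allowed_spec PORT... -> prints "tcp:PORT,tcp:PORT,..."
allowed_spec() {
    spec=""
    for port in "$@"; do
        # Append a comma only when the list already has an entry
        spec="${spec:+$spec,}tcp:$port"
    done
    printf '%s\n' "$spec"
}

RULE_NAME="allow-web"
NETWORK="default"

echo "gcutil addfirewall $RULE_NAME --network=$NETWORK --allowed=$(allowed_spec 80 443)"
```

Listing the ports explicitly keeps the default-deny posture intact: only the services you name are exposed.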
Adding Storage
GCE has two kinds of storage: scratch disks and persistent disks. When you create a GCE instance, for example, you get a default disk of 10GB. This "scratch space" storage shouldn't be used to save mission critical data and can't be used to share data; instead, you should use a persistent disk.
A scratch disk is tied to the virtual instance itself and will not be as performant as persistent storage with Google Cloud. Remember, scratch storage isn't where you store or back up critical data – unless you like to lose data – because you might delete and recreate instances.
A persistent disk is separate from any instance and exists outside your virtual instances. You can think of these as your virtual enterprise cloud storage that you create, format, and mount to make available to your instances. Adding a persistent disk can be done both with Gcutil and from the web GUI. For the sake of space and to get you up and running quickly, I will use the quickest method: the web console. Again, those familiar with almost any other cloud provider will feel right at home with the ease of use and power of the Google Cloud Platform.
Adding persistent storage is as easy as going to the Google Cloud Console and navigating to Compute Engine | Disks and then New Disk (Figure 4). Fill in a name for this disk and any related description; then, pick a zone (same zone as you specified before or it will not work) and select a source type of None (blank disk).
Finally, select a size for the new persistent disk and click Create, then click on your instance and scroll down to the Disks section. Select attach and add the disk you just created with read/write (Figure 5). Now you should SSH into your instance and look at your current disks (Listing 2):
$ gcutil ssh gcerocks-instance-1
joe@gcerocks-instance-1:~$ sudo fdisk -l
Listing 2: Listing Disks
Disk /dev/sda: 10.7 GB, 10737418240 bytes
4 heads, 32 sectors/track, 163840 cylinders, total 20971520 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk identifier: 0x0001e258

Device Boot Start End Blocks Id System
/dev/sda1 2048 20971519 10484736 83 Linux

Disk /dev/sdb: 536.9 GB, 536870912000 bytes
255 heads, 63 sectors/track, 65270 cylinders, total 1048576000 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk identifier: 0x00000000

Disk /dev/sdb doesn't contain a valid partition table
Next, you need to add a partition table, format the disk, make a mount point (here, /mnt/pdisk), and mount the new disk. These commands need root privileges, so run them with sudo:
joe@gcerocks-instance-1:~$ sudo fdisk /dev/sdb
joe@gcerocks-instance-1:~$ sudo mkfs.ext3 /dev/sdb
joe@gcerocks-instance-1:~$ sudo mkdir /mnt/pdisk
joe@gcerocks-instance-1:~$ sudo mount /dev/sdb /mnt/pdisk
Finally, you can see your new disk available in its almost 500GB of glory (Listing 3).
Listing 3: Viewing a Disk
joe@gcerocks-instance-1:~$ df -hl
Filesystem Size Used Avail Use% Mounted on
rootfs 9.9G 722M 8.7G 8% /
udev 10M 0 10M 0% /dev
tmpfs 171M 108K 171M 1% /run
/dev/disk/by-uuid/a3864f53-b3b7-4a6d-9a27-548305aa6594 9.9G 722M 8.7G 8% /
tmpfs 5.0M 0 5.0M 0% /run/lock
tmpfs 342M 0 342M 0% /run/shm
/dev/sdb 493G 198M 467G 1% /mnt/pdisk
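The sizes in Listings 2 and 3 look inconsistent at first glance: fdisk reports 536.9GB, whereas df shows 493G. The short calculation below, which uses only the sector count and sector size from Listing 2, shows that both describe the same disk in different units. The variable names are chosen for this illustration.

```shell
#!/bin/sh
# Worked arithmetic for the sizes reported in Listings 2 and 3.

SECTORS=1048576000      # total sectors of /dev/sdb (Listing 2)
SECTOR_SIZE=512         # logical sector size in bytes (Listing 2)

BYTES=$((SECTORS * SECTOR_SIZE))
GB=$((BYTES / 1000000000))            # decimal gigabytes, as fdisk counts
GIB=$((BYTES / 1024 / 1024 / 1024))   # binary gibibytes, as df -h counts

echo "bytes: $BYTES"
echo "GB (10^9 bytes): $GB"
echo "GiB (2^30 bytes): $GIB"
```

The disk holds 536,870,912,000 bytes, which is about 536GB in decimal units and exactly 500GiB in binary units. The remaining gap between 500GiB and the 493G that df reports is presumably space the ext3 filesystem reserves for its own metadata.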
Finale
Now that you have created an instance, set up Cloud SDK, and added some storage, you're on your way. I hope you've enjoyed this quick overview of the Google Compute Engine and that I've provided some introductory insights into this compelling platform. With the beginnings of your cloud infrastructure set up, you are primed to build whatever you like with this powerful IaaS cloud option, so have some fun in the cloud playground.