Amazon CloudFront special request headers – Utilizing the Capabilities of the AWS Global Network at the Near Edge

In Chapter 2, we discussed the mechanisms Global Server Load Balancing (GSLB) systems such as Amazon Route 53 use to determine the geographic location of a client’s IP address on the internet. Amazon CloudFront makes use of these same facilities to determine which edge location is closest to a given user so that it can steer them toward a cache containing the object they requested. It makes sense to use this data to populate additional request headers and pass them along to the origin for decision-making whenever a viewer makes a request:

| Category | Fields | Examples |
| --- | --- | --- |
| Country | Name | United States |
| Region | Region, Region-Name | TX, Texas |
| Locality | City, Postal-Code, Metro-Code | Dallas, 75001, 214 |
| Coordinates | Latitude, Longitude | 32.779167, -96.808891 |
| Time | Time-Zone | America/Chicago |
| Device Type | Is-Mobile | True |
| | Is-Desktop | False |
| | Is-Tablet | True |
| | Is-IOS | True |
| | Is-Android | False |
| | Is-SmartTV | False |

Figure 8.6 – Amazon CloudFront viewer headers for an iPad using 5G in Dallas, TX

These headers can be quickly enabled under the Behaviors section of an Amazon CloudFront distribution by including them in the cache policy or origin request policy attached to a cache behavior.
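For readers who manage distributions programmatically, the following is a minimal sketch of what an origin request policy that forwards a few of these headers might look like. The helper name `build_origin_request_policy_config`, the policy name, and the chosen subset of headers are assumptions made for illustration; the dictionary follows the shape that the CloudFront `CreateOriginRequestPolicy` API expects, but verify it against the current API reference before relying on it.

```python
# Full header names as CloudFront sends them; the table above abbreviates them.
VIEWER_HEADERS = [
    "CloudFront-Viewer-Country",
    "CloudFront-Viewer-City",
    "CloudFront-Is-Tablet-Viewer",
]


def build_origin_request_policy_config(name, headers):
    """Build a policy config dict that forwards only the listed headers.

    Hypothetical helper: it only builds the dictionary and makes no AWS calls.
    """
    return {
        "Name": name,
        "Comment": "Forward selected CloudFront viewer headers to the origin",
        "HeadersConfig": {
            "HeaderBehavior": "whitelist",
            "Headers": {"Quantity": len(headers), "Items": list(headers)},
        },
        # This sketch forwards no cookies or query strings.
        "CookiesConfig": {"CookieBehavior": "none"},
        "QueryStringsConfig": {"QueryStringBehavior": "none"},
    }


config = build_origin_request_policy_config("viewer-geo-headers", VIEWER_HEADERS)

# With boto3 installed and credentials configured, the config could be
# submitted roughly like this (not executed here):
#   import boto3
#   boto3.client("cloudfront").create_origin_request_policy(
#       OriginRequestPolicyConfig=config
#   )
```

The resulting policy would still need to be attached to a cache behavior before the origin starts receiving the headers.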

These headers make it easy for your distribution to do things such as the following:

Retrieve an image or video that is appropriately sized for the client browser without needing to maintain a matrix of which User-Agent string maps to which kind of device

Respond with regionalized content that isn’t reliant upon the client headers – just because someone has the language set to English doesn’t mean they are in England

Comply with regulations that say a website operator must make every effort to ensure certain content is only served within a given region, country, state, or municipality

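Of course, the same problems GSLB faces are present here. Someone connecting to a VPN service in a certain region will appear to be in that region, so treat the location headers as a best-effort signal rather than proof of where a viewer actually is.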
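The use cases above can be sketched as a small piece of origin-side logic. This is an illustration only: `choose_response`, `ALLOWED_COUNTRIES`, and the variant names are hypothetical, and the code assumes the origin receives the full CloudFront header names (for example, `CloudFront-Viewer-Country` and `CloudFront-Is-Tablet-Viewer`) with device flags sent as the strings `true` or `false`.

```python
from typing import Mapping

# Hypothetical policy for this sketch: countries allowed to receive the content.
ALLOWED_COUNTRIES = {"US", "CA"}


def _flag(headers: Mapping[str, str], name: str) -> bool:
    """Interpret a CloudFront device header ("true"/"false") as a boolean."""
    return headers.get(name, "false").lower() == "true"


def choose_response(headers: Mapping[str, str]) -> dict:
    """Pick a status code and media variant from CloudFront viewer headers."""
    # HTTP header names are case-insensitive, so normalize them first.
    h = {k.lower(): v for k, v in headers.items()}

    # Fail closed: a missing or disallowed country is refused.
    if h.get("cloudfront-viewer-country") not in ALLOWED_COUNTRIES:
        return {"status": 403, "variant": None}

    # Check tablet before mobile, because a tablet such as the iPad in
    # Figure 8.6 reports both Is-Mobile and Is-Tablet as true.
    if _flag(h, "cloudfront-is-tablet-viewer"):
        variant = "tablet"
    elif _flag(h, "cloudfront-is-mobile-viewer"):
        variant = "mobile"
    elif _flag(h, "cloudfront-is-smarttv-viewer"):
        variant = "tv"
    else:
        variant = "desktop"
    return {"status": 200, "variant": variant}


ipad_in_dallas = {
    "CloudFront-Viewer-Country": "US",
    "CloudFront-Is-Mobile-Viewer": "true",
    "CloudFront-Is-Tablet-Viewer": "true",
    "CloudFront-Is-Desktop-Viewer": "false",
}
print(choose_response(ipad_in_dallas))
```

Refusing requests that lack a country header is a design choice for the compliance case; a site that only uses the headers for image sizing would more likely fall back to a default variant.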

Regional edge caches (RECs)

RECs represent a second tier of caching that isn’t as close to viewers as an edge location but isn’t as far away as the origin. RECs are larger than edge POPs and therefore hold objects longer before removing them from the cache. This is useful in situations where a piece of content, say an MP4 video file for a training course, is heavily requested by your users for a few days and then activity drops off until a long 4-day weekend is over, at which point requests begin again at 8 A.M. on the first day back. Or maybe it’s an image for a product being sold on an e-commerce site that isn’t requested often enough to be populated globally in all edge locations but does get hit enough to stay in the RECs consistently:

Figure 8.7 – Multiple stages of caching to limit the impact of misses
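The effect of the second tier can be illustrated with a toy simulation. This is a sketch under stated assumptions, not a model of CloudFront internals: the least-recently-used eviction policy, the cache sizes, and the names `LRUCache` and `fetch` are all chosen for illustration.

```python
from collections import OrderedDict


class LRUCache:
    """A tiny least-recently-used cache (an assumed eviction policy)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict least recently used


def fetch(key, edge, rec, origin, stats):
    """Look in the edge cache, then the REC, then fall back to the origin."""
    value = edge.get(key)
    if value is not None:
        stats["edge"] += 1
        return value
    value = rec.get(key)
    if value is not None:
        stats["rec"] += 1
    else:
        value = origin[key]
        stats["origin"] += 1
        rec.put(key, value)  # populate the larger second tier
    edge.put(key, value)  # populate the small first tier
    return value


origin = {"training.mp4": "video", "a.jpg": "img-a", "b.jpg": "img-b"}
edge, rec = LRUCache(capacity=2), LRUCache(capacity=10)
stats = {"edge": 0, "rec": 0, "origin": 0}

# The video is requested, then pushed out of the small edge cache by other
# objects, then requested again: the REC answers without an origin fetch.
for key in ["training.mp4", "a.jpg", "b.jpg", "training.mp4"]:
    fetch(key, edge, rec, origin, stats)
print(stats)
```

In this run, the second request for the video misses at the edge but hits the REC, so the origin is contacted only once per object.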