Amazon CloudFront uses a construct known as a distribution to establish and govern this behavior. When a new distribution is created, it is pointed at an origin – the server or service that holds the content we want to cache in our POPs. Amazon CloudFront supports several types of origins – including S3 buckets, AWS Elemental MediaStore containers and MediaPackage channels, Application Load Balancers, AWS Lambda functions, EC2 instances, and even servers that aren’t part of AWS at all (custom origins):

Figure 8.5 – Content caching at the near edge
Consider a situation where we set up a new CloudFront distribution for an S3 bucket located at https://mybucket.s3.eu-west-2.amazonaws.com/. The first time a request comes in from a client in Helsinki for s3://mybucket/image1.jpg, the file will be retrieved by an edge POP in Helsinki from the London region (eu-west-2) and put into that edge POP’s cache. This is known as a cache miss. Two hours later, someone in Vaasa (close to Helsinki) requests image1.jpg. Because they are routed through the same edge POP in Helsinki, they will retrieve the image directly from that POP’s cache rather than having to go back to the London region. This is known as a cache hit. The proportion of all requests that are served as hits is called the cache hit ratio. The higher the ratio, the better the performance.
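The hit and miss sequence described above can be sketched in a few lines of Python. This is only a toy model, not how CloudFront is implemented: the `EdgePop` class, its fields, and the placeholder origin contents are all hypothetical, and the hit ratio is computed here as hits divided by total requests.

```python
from dataclasses import dataclass, field


@dataclass
class EdgePop:
    """Toy model of a single edge POP's cache (hypothetical, for illustration)."""
    name: str
    cache: dict = field(default_factory=dict)
    hits: int = 0
    misses: int = 0

    def get(self, key: str, origin: dict) -> str:
        if key in self.cache:
            self.hits += 1                  # cache hit: served from the POP
        else:
            self.misses += 1                # cache miss: fetch from the origin
            self.cache[key] = origin[key]   # ...and keep a copy for next time
        return self.cache[key]

    def hit_ratio(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


# Stand-in for the S3 bucket in the London region.
origin = {"image1.jpg": "<bytes of image1.jpg>"}

helsinki = EdgePop("Helsinki")
helsinki.get("image1.jpg", origin)   # first request: miss
helsinki.get("image1.jpg", origin)   # later request via the same POP: hit

print(helsinki.hits, helsinki.misses, helsinki.hit_ratio())
```

With one miss followed by one hit, half of the requests were served from the cache. As more clients request the same object through the same POP, the ratio climbs toward 1.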
An object remains in the cache until it expires. At that point, the edge POP will either delete it entirely from its cache (if no one has requested it for a long enough period) or, if the object is still popular but has exceeded its designated time-to-live (TTL), fetch a fresh copy.
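A minimal sketch of TTL-based expiry may help make this concrete. The `TtlCache` class, the `fetch_from_origin` helper, and the 24-hour TTL are assumptions chosen for illustration; the clock is passed in explicitly so the example is deterministic, and the real eviction logic inside an edge POP is certainly more involved.

```python
from typing import Callable, Dict, Tuple


class TtlCache:
    """Toy TTL cache (hypothetical); times are plain seconds supplied by the caller."""

    def __init__(self, ttl_seconds: float, fetch: Callable[[str], str]):
        self.ttl = ttl_seconds
        self.fetch = fetch
        self.entries: Dict[str, Tuple[str, float]] = {}  # key -> (value, expires_at)

    def get(self, key: str, now: float) -> Tuple[str, str]:
        entry = self.entries.get(key)
        if entry is not None and now < entry[1]:
            return entry[0], "hit"               # still within its TTL
        status = "miss" if entry is None else "refresh"
        value = self.fetch(key)                  # go back to the origin
        self.entries[key] = (value, now + self.ttl)
        return value, status

    def purge_expired(self, now: float) -> int:
        """Drop entries that expired and were never requested again."""
        expired = [k for k, (_, exp) in self.entries.items() if now >= exp]
        for k in expired:
            del self.entries[k]
        return len(expired)


fetches = []

def fetch_from_origin(key: str) -> str:
    fetches.append(key)                          # record each trip to the origin
    return f"contents of {key}"


cache = TtlCache(ttl_seconds=86400, fetch=fetch_from_origin)
first = cache.get("image1.jpg", now=0)           # never seen before
second = cache.get("image1.jpg", now=7200)       # two hours later
third = cache.get("image1.jpg", now=90000)       # after the 24-hour TTL
print(first[1], second[1], third[1], len(fetches))
```

Only the first and third requests reach the origin. The second is answered from the cache because the object is still inside its TTL.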
You don’t want to cache everything. A good example would be files such as index.php or cart.aspx – these are dynamically generated. If you cache a shopping cart page, every user will be served the same stale copy and no one will be able to add things to their basket. However, the GIF or JPG files for the products listed on these pages should be cached. It is not uncommon for a web page to consist of mixed elements like this – some static, some dynamic. Therefore, it is typical to set some content to be cached and other content never to be cached. These rules can be set as behaviors of the distribution.
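The idea of per-path caching rules can be sketched as an ordered list of patterns where the first match wins. The rule list, the `should_cache` function, and the default of not caching are all hypothetical; in practice these rules live in the distribution's configuration rather than in application code.

```python
from fnmatch import fnmatchcase

# Hypothetical rules: (path pattern, cache this content?). First match wins.
BEHAVIORS = [
    ("*.php", False),    # dynamically generated pages
    ("*.aspx", False),
    ("*.gif", True),     # static product images
    ("*.jpg", True),
]
DEFAULT_CACHEABLE = False  # assumption: anything unmatched is not cached


def should_cache(path: str) -> bool:
    """Return whether a request path should be cached under the rules above."""
    for pattern, cacheable in BEHAVIORS:
        if fnmatchcase(path, pattern):
            return cacheable
    return DEFAULT_CACHEABLE


for path in ["/index.php", "/cart.aspx", "/products/shoe.jpg", "/banner.gif"]:
    print(path, should_cache(path))
```

A page that mixes static and dynamic elements therefore ends up with its images served from the edge, while the page itself is always generated fresh by the origin.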