What "restoration" actually means here
It is important to be honest about this up front: GFPGAN is not recovering the original information in a degraded face. The original information is gone — that is what "low resolution" or "blurry" means. What GFPGAN does is generate plausible facial detail consistent with what the model has learned about human faces from a large training set. The output is realistic, but it is also fictional. The eyes in the result are not your subject's actual eyes; they are eyes the model invented that fit the rest of the face.
For family photos, this is fine. The result looks like the person, captures their likeness, and conveys their expression. For anything where authenticity matters — legal evidence, identification, journalism — this kind of restoration is not appropriate without explicit disclosure.
How GFPGAN works
GFPGAN combines two ideas. First, a face detector locates and crops the face region from your input. Second, a generative facial prior (the GFP in GFPGAN) is used to refine that crop. The prior is a pre-trained StyleGAN2 face generator — a model that already knows how to draw realistic faces in immense detail. GFPGAN does not start from scratch; it nudges the StyleGAN2 generator until its output matches the low-quality input. The result is a face that looks like the input but with all the texture, sharpness, and structure that StyleGAN2 brings.
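GFPGAN's actual refinement works through learned feature modulation inside the network, but the core "nudge the generator until it matches" idea can be sketched with a toy example: a fixed linear map standing in for the generator, block-averaging standing in for the degradation, and gradient descent on the latent code. Everything here (the linear generator, the 4× downsample) is an illustrative stand-in, not the real architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "generator": maps an 8-dim latent code to a 64-pixel "face".
# A real prior (StyleGAN2) is a deep network; a fixed random linear
# map is enough to show the inversion idea.
A = rng.normal(size=(64, 8))

def generate(w):
    return A @ w

# Degradation: 4x downsampling by block averaging (64 -> 16 pixels).
def degrade(x):
    return x.reshape(16, 4).mean(axis=1)

# The low-quality observation we want to match.
w_true = rng.normal(size=8)
y = degrade(generate(w_true))

# "Nudge" the latent code by gradient descent on the pixel loss
# ||degrade(generate(w)) - y||^2.
D = np.kron(np.eye(16), np.full((1, 4), 0.25))  # matrix form of degrade()
M = D @ A
w = np.zeros(8)
for _ in range(5000):
    grad = 2 * M.T @ (M @ w - y)
    w -= 0.01 * grad

residual = np.linalg.norm(degrade(generate(w)) - y)
print(f"residual after optimisation: {residual:.6f}")
```

The optimised latent reproduces the degraded observation almost exactly, while `generate(w)` is a full-resolution output — the same shape of trick, in miniature, that lets a face prior supply detail the input never contained.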
The key innovation in the paper is how that "match the input" step is done. A simple loss on pixel-by-pixel difference would produce a blurry compromise. GFPGAN uses a combination of identity-preserving losses, perceptual losses, and adversarial losses to produce something that is sharp, photorealistic, and recognisably the same person.
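The blurry-compromise problem is easy to demonstrate: when several sharp images are all equally plausible restorations, the prediction that minimises mean squared error is their pixel-wise average, which is sharp nowhere. A minimal 1-D sketch:

```python
import numpy as np

# Two equally plausible sharp "edges": one with the step at position 4,
# one at position 6. Suppose both are consistent with the same blurry input.
edge_a = np.array([0, 0, 0, 0, 1, 1, 1, 1, 1, 1], dtype=float)
edge_b = np.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1], dtype=float)

# The MSE-optimal prediction, when both are equally likely, is the
# pixel-wise mean: a soft ramp instead of a sharp step.
mse_optimal = (edge_a + edge_b) / 2
print(mse_optimal)

# Sharpness proxy: the largest single-step jump. Each candidate jumps
# by 1.0; the MSE compromise never jumps by more than 0.5.
print(np.abs(np.diff(edge_a)).max(), np.abs(np.diff(mse_optimal)).max())
```

An adversarial loss penalises exactly this kind of implausible in-between output, which is why GFPGAN's combined objective can commit to one sharp answer instead of averaging over all of them.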
When it works well
- Mildly low-resolution faces. A face that occupies 50×50 pixels in the original can come out looking like a sharp 200×200 portrait.
- JPEG compression artefacts. Heavy compression destroys facial detail; GFPGAN replaces it with clean, plausible detail.
- Mild motion blur. If the blur is small relative to the face, the model can recover crisp features.
- Old photographs scanned at modest resolution. The combination of film grain and limited detail is exactly what GFPGAN is trained to handle.
- Faces in good lighting, looking roughly toward the camera. The closer to the training distribution, the better the result.
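The size thresholds in these lists can be turned into a quick sanity check before running the model. The helper below is hypothetical, and the cut-offs are rules of thumb taken from the cases above and below, not values from the GFPGAN paper:

```python
def restoration_prognosis(face_w: int, face_h: int) -> str:
    """Rough expectation for GFPGAN output quality given the face crop
    size in pixels. Thresholds are rules of thumb, not paper values."""
    side = min(face_w, face_h)
    if side < 24:
        # ~20x20 px: too little signal; the output is mostly the
        # prior's invention and may not resemble the subject.
        return "unreliable"
    if side < 80:
        # ~50x50 px: the sweet spot described above.
        return "good"
    # Already reasonably detailed; restoration mostly sharpens texture.
    return "refinement-only"

print(restoration_prognosis(20, 20))   # unreliable
print(restoration_prognosis(50, 50))   # good
```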
When it falls down
- Severe blur or extremely low resolution. A 20×20 pixel face has too little signal; the model fills in mostly from its prior, and the result may not look like the person at all.
- Side profiles. StyleGAN2 was trained mostly on near-frontal faces. Profile shots can come out distorted.
- Faces wearing masks, sunglasses, or heavy makeup. The model wants to draw the eyes and mouth it expects.
- Older subjects. The training data skews younger; GFPGAN sometimes smooths wrinkles and softens skin in ways that look slightly age-shifted.
- Children with adult features, or vice versa. If the input falls ambiguously between the model's typical age ranges, results can look odd.
- Multiple overlapping faces. The detector may merge or miss faces in dense group photos.
Practical tips
- Start with the highest-quality scan or copy you have. Even modest improvements at the source compound through the model.
- Crop the photo so the face is prominent. The 512px downscale we apply costs far more facial detail when the face is a tiny region of a wide shot.
- If you have multiple versions of the same photo, try each. The model can produce noticeably different outputs from minor variations in the input.
- For very old photos, fix physical damage before restoring faces. The Photo Healer incorporates a similar restoration step but also handles scratches, tears, and dust.
- Disclose AI restoration in any caption, especially for genealogy and historical work.
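The cropping tip can be automated once you know where the face is. This sketch assumes a face bounding box is already available (in practice it would come from a face detector) and simply expands it by a margin so the face dominates the frame before any fixed-size downscale:

```python
import numpy as np

def crop_around_face(img, box, margin=0.5):
    """Crop img (H, W, C) to the face box expanded by `margin` of the
    box size on each side, clamped to the image bounds."""
    x0, y0, x1, y1 = box
    mw = int((x1 - x0) * margin)
    mh = int((y1 - y0) * margin)
    h, w = img.shape[:2]
    return img[max(0, y0 - mh):min(h, y1 + mh),
               max(0, x0 - mw):min(w, x1 + mw)]

# A tiny 100x100 face box in a wide 1000x2000 frame: after cropping,
# the face occupies a far larger fraction of the pixels the model sees.
frame = np.zeros((1000, 2000, 3), dtype=np.uint8)
crop = crop_around_face(frame, (900, 400, 1000, 500))
print(crop.shape)  # (200, 200, 3)
```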
An honest comparison
If you have used Topaz Gigapixel, Adobe's Photoshop Neural Filters, or Remini, you've used a similar class of tool. GFPGAN is on roughly the same tier as those products and has the advantage of being open-source and runnable for free. It does not have the polish of a paid product (no batch processing, no manual tuning, no inpainting brush), but for single-photo restoration the output quality is competitive.