request: make (gr)avatar URL a setting (to support non-public avatar sources) #292

@juleskers

Is your feature request related to a problem? Please describe.

During #291 you implemented support for SHA256 Gravatars (thanks!)
I actually wasn't aware of the improved hash support, so I started researching what else had changed since I last updated my knowledge on this topic.
Spoiler: a lot 😁

TL;DR: a LOT of sources now implement the <some-base-url>/<sha256-of-email>?params convention, so the "gravatar mechanism" can easily be extended for other providers (including local, offline self-hosted ones)

Below is a summary of my research this evening (+ night, I'm quite past my bed-time; self-nerd-snipe yo!)
Perhaps I shouldn't be surprised that the intersection between "user identity" and "privacy in distributed systems" is a complicated topic 👼

Describe the solution you'd like

GRavatar(.com) has set a de-facto standard for a template string: <some-base-url>/<sha256-of-email>?params.

Thus, to support other avatar-hosts, all that is needed is making "avatar URL template" a setting, rather than a hardcoded string.
Special-casing null (or "disabled", or NoOpAvatarFetchingStrategy or whatever) can then function as #291's/(6d8c5dc) useGravatar-boolean.
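
A minimal Python sketch of the "template as a setting" idea, purely illustrative and not Gitnuro code. The `{hash}` placeholder, the function names and the example host are assumptions made up for this sketch; the email normalisation (trim, lower-case, then SHA256) follows the gravatar convention.

```python
import hashlib
from typing import Optional

# Hypothetical placeholder syntax; the issue only fixes the overall URL shape.
HASH_PLACEHOLDER = "{hash}"


def email_hash(email: str) -> str:
    """SHA256 hex digest of the trimmed, lower-cased email address."""
    normalized = email.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def avatar_url(template: Optional[str], email: str) -> Optional[str]:
    """Fill the template for one committer; None means avatars are disabled."""
    if template is None:
        return None  # disabled: no URL is built, so no network lookup can happen
    return template.replace(HASH_PLACEHOLDER, email_hash(email))
```

With this shape, `avatar_url(None, email)` plays the role of the current `useGravatar = false`, and any other value is just a different host behind the same code path.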

For convenience, a couple of "presets" (such as for gravatar.com or libravatar.org) should probably be provided too, each setting a pre-configured URL template.
(so ideally: a radio-button listing "disabled(the default) / gravatar.com / libravatar.org / custom (enables second text input)")
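
The radio-button idea could map onto a settings object roughly like the sketch below. All names here (`AvatarSource`, `AvatarSettings`, `effective_template`) are hypothetical, and the two preset URLs are assumptions that should be checked against each provider's documentation before use.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class AvatarSource(Enum):
    DISABLED = "disabled"
    GRAVATAR = "gravatar.com"
    LIBRAVATAR = "libravatar.org"
    CUSTOM = "custom"


# Assumed preset templates; verify against the providers' own docs.
PRESET_TEMPLATES = {
    AvatarSource.GRAVATAR: "https://www.gravatar.com/avatar/{hash}",
    AvatarSource.LIBRAVATAR: "https://seccdn.libravatar.org/avatar/{hash}",
}


@dataclass(frozen=True)
class AvatarSettings:
    source: AvatarSource = AvatarSource.DISABLED  # disabled is the default
    custom_template: str = ""  # only read when source is CUSTOM


def effective_template(settings: AvatarSettings) -> Optional[str]:
    """Resolve the settings to one URL template, or None when disabled."""
    if settings.source is AvatarSource.DISABLED:
        return None
    if settings.source is AvatarSource.CUSTOM:
        template = settings.custom_template.strip()
        # A custom template without the placeholder is treated as "disabled".
        return template if "{hash}" in template else None
    return PRESET_TEMPLATES[settings.source]
```

Everything downstream then only needs the resolved template (or `None`), regardless of which radio button produced it.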

Importantly: all alternatives I've found support SHA256, so no configurable hashing strategy is needed, which simplifies technical implementation and user-facing configuration enormously.
MD5 seems to be outdated, though still working for some implementations. After the privacy-busting attacks demonstrated against it already in 2009, it seems newer implementations just skipped MD5 completely and started directly with SHA256.

Re-identification risk remains, regardless of hash-type

I'd like to point out that opt-out continues to be important, even with SHA256 hashing.
The fundamental design of "stable hash of stable identifier" allows cross-identifying users across websites/programs. Indeed that is the entire design goal: get a consistent picture/profile everywhere.

It doesn't matter if one uses MD5, SHA256 or PBKDF or even ARGON2: as long as the input is unsalted-email, the output token will be stable across locations, and thus correlating (for example) a stackoverflow account to that sensitive blogpost from a decade ago remains possible.
The only way around this is to add site-specific (or even user-specific) salts into the gravatar-hash, thus deviating from the gravatar-protocol. This (intentionally) breaks avatar-lookup, and because users cannot sign-up to the gravatar-host with a salted input email, they lose any possibility of customising their (auto-generated) avatar.
StackOverflow does this intentionally, salting the email if they don't find a gravatar-account for the unsalted email. Additionally, they provide a stack-hosted "uploaded image" alternative to avoid gravatar completely. I consider this the "state of the art" for balancing gravatar-support (by default) with privacy (by choice). It is non-trivial to implement though, requiring multiple rounds of lookup, and a completely separate secondary avatar system (static image).
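
To make the correlation argument concrete, here is a small Python sketch. The salt values and the plain "salt + email" concatenation are assumptions for illustration only; this is not a description of how StackOverflow actually salts.

```python
import hashlib


def avatar_token(email: str, site_salt: str = "") -> str:
    """SHA256 of the normalised email, optionally prefixed by a per-site salt."""
    normalized = email.strip().lower()
    return hashlib.sha256((site_salt + normalized).encode("utf-8")).hexdigest()


email = "jane@example.com"

# Unsalted: two unrelated sites derive the same token, so accounts can be linked.
unsalted_site_a = avatar_token(email)
unsalted_site_b = avatar_token(email)
print(unsalted_site_a == unsalted_site_b)  # True

# Salted per site: tokens differ, so linking fails, but so does avatar lookup.
salted_site_a = avatar_token(email, "site-a-secret")
salted_site_b = avatar_token(email, "site-b-secret")
print(salted_site_a == salted_site_b)  # False
```

Swapping SHA256 for a slower hash changes nothing in the first comparison, which is the point: stability of the token, not the strength of the hash, is what enables correlation.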

Describe alternatives you've considered

I see (roughly) four levels of complexity in solving the general problem of "pictures for user identity".
I want to list all of them (and some variations I came up with) in order to have a full overview, and to inform thinking about proper software architecture (depending on how many you think are worth covering, perhaps a Strategy Pattern for avatar-sourcing implementations may be worthwhile)

  1. Do nothing: just use committer name+email in text form.
    • no cost/benefit evaluation is complete without the literally-zero-cost solution of doing-nothing-at-all.
    • The least hassle, but the post-#291 status quo ("gitnuro is private" promise not consistent with gravatar usage) is already more featureful than this.
    • This is the most powerful in terms of privacy preserving: since no third party whatsoever is involved, only what is provided in the git-repo itself
  2. local, file-based lookup
    • have a config directory for avatars on disk (e.g. on Linux $XDG_CONFIG_HOME/gitnuro/avatars/<committer.email>.png). The Gitnuro user can populate this manually.
    • perfectly privacy-preserving, since no network lookups are involved, only local storage
    • avatar is configurable separately for each individual gitnuro user (my avatar-for-me doesn't need to be the same as your avatar-for-me)
    • lots of manual legwork (fetching images, naming files) for gitnuro user, thus not attractive from an end-user perspective
    • any "smart" pre-fetching of images from some centralised upstream quickly becomes identical to option 4: custom gravatar-style url.
    • modern languages/libraries make fetching from random internet servers as easy as fetching from disk, so technical implementation is about as hard as network-based alternatives, I think(?).
  3. hardcoding a single gravatar-style provider
    • this was the status quo before #291 ("gitnuro is private" promise not consistent with gravatar usage)
    • Privacy can-of-worms instantly fully open: leaking of stable personal identifiers for any/all committers over network, to single, fixed trusted(?) party.
    • Regardless whatever caching is done, at least one lookup must be done, thus "uncloaking" the ID's existence.
    • Depending on cache-strategy: also leaks usage patterns of gitnuro itself.
    • No way to please everyone with only a single, hardcoded provider.
  4. configurable gravatar-style provider (multiple providers and/or custom URL):
    • This is what I am suggesting in this feature request
    • technical implementation only slightly more complex than single hardcoded URL, thus not something I consider a separate level of complexity.
    • privacy can-of-worms somewhat open, if the user chooses to enable the feature, and the avatar-host is unfriendly.
    • by configuring a trusted URL, privacy can be preserved.
      For example: a central company git-server already knows all committer-ids, and timestamps of when you commit/push/pull. It learns nothing new from its avatar-api: neither about committer identities, nor about your working times.
    • can support non-public avatar sources, such as self-hosted git-servers on company intranet.
    • This level automatically unlocks "central" libravatar: https://wiki.libravatar.org/description/
  5. fully-federated libravatar: gitnuro discovers gravatar-url for each committer-email-domain.
    • technically the most challenging, since full federation requires host-discovery (via DNS SRV records), not just string-template replacing.
    • The only libravatar java library that I found doesn't implement federation (https://github.com/alessandroleite/libravatar-j)
    • While I personally think it's a worthy goal, I consider federation out-of-scope for this feature request.
      A fixed configurable URL will serve 80% of use-cases with 20% of the implementation effort compared to federation.
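
If more than one of the options above ends up being supported, the Strategy Pattern mentioned earlier could look roughly like this Python sketch. Class and method names are hypothetical, the `.png` naming follows option 2, and federation (option 5) is deliberately left out.

```python
import hashlib
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Optional


class AvatarStrategy(ABC):
    @abstractmethod
    def locate(self, email: str) -> Optional[str]:
        """Return a file path or URL for the avatar, or None for text only."""


class NoAvatar(AvatarStrategy):  # option 1: do nothing
    def locate(self, email: str) -> Optional[str]:
        return None


class LocalFileAvatar(AvatarStrategy):  # option 2: local, file-based lookup
    def __init__(self, directory: Path):
        self.directory = directory

    def locate(self, email: str) -> Optional[str]:
        name = email.strip().lower()
        if "/" in name or "\\" in name:
            return None  # refuse anything that could escape the avatar directory
        candidate = self.directory / f"{name}.png"
        return str(candidate) if candidate.is_file() else None


class UrlTemplateAvatar(AvatarStrategy):  # options 3 and 4: gravatar-style URL
    def __init__(self, template: str):
        self.template = template

    def locate(self, email: str) -> Optional[str]:
        digest = hashlib.sha256(email.strip().lower().encode("utf-8")).hexdigest()
        return self.template.replace("{hash}", digest)
```

Options 3 and 4 share one class here because they differ only in where the template string comes from, which matches the point that they are not separate levels of complexity.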

context, documentation and research notes

  1. gravatar.com the original which started it all
  2. my $DAYJOB uses self-hosted gitea.
    Gitea supports configurable URLs for avatar-fetching, but uses the gravatar-style SHA256 hash.
    • Server Configuration docu cheatsheet > avatar section
    • todo: look up the template for our intranet instance (from memory: https://git.dept.intra.company.com/avatars/ or something like that)
      a self-signed intranet certificate would raise the trust problem described in item 6 (import the certificate into the JVM keystore, a "trust-on-first-use" ("TOFU") prompt, or a wildly-unsafe-but-easy "disable HTTPS cert validation" boolean setting)
  3. Libravatar.org offers a FOSS alternative to gravatar.com
    • centralised lookup template: https://seccdn.libravatar.org/gravatarproxy/${the-hash}?s=512&default=identicon
    • ivatar software (and some alternative implementations) available for self-hosting
  4. public gitea.com seems to allow user-specified URLs, for example:
    • multiple entries use https://seccdn.libravatar.org/gravatarproxy/${the-hash}?s=512&default=identicon
    • a few seem to have some sort of "local CDN" in the form of https://4d3e0f26919f429c2b0092fb846c818a.r2.cloudflarestorage.com/gitea-com-prod/avatars/${the-hash}?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=<...lots more X-Amz parameters>
  5. Gitnuro Alternative sourcegit exists.
  6. random unsorted concern: self-hosted instances with self-signed intranet HTTPS certificates will be "fun"; either the user must import a certificate into the system java keystore, or gitnuro will have to show a "Trust on First Use (TOFU)" prompt (and store the answer in an extra gitnuro keystore), or offer a (wildly-unsafe) "skip certificate validation" boolean setting.
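
The TOFU variant from item 6 boils down to pinning a certificate fingerprint per host. The Python sketch below shows only that decision logic under assumed names (`check_certificate`, `pins`, `user_accepts`); persisting the pins and wiring this into a real TLS stack are out of scope here.

```python
import hashlib
from typing import Callable, Dict


def check_certificate(
    pins: Dict[str, str],
    host: str,
    cert_der: bytes,
    user_accepts: Callable[[str, str], bool] = lambda host, fingerprint: False,
) -> bool:
    """Trust-on-first-use check against a host -> SHA256 fingerprint store."""
    fingerprint = hashlib.sha256(cert_der).hexdigest()
    pinned = pins.get(host)
    if pinned is None:
        # First contact: ask the user and remember a "yes".
        if user_accepts(host, fingerprint):
            pins[host] = fingerprint
            return True
        return False
    # Later contacts: the certificate must match what was accepted before.
    return pinned == fingerprint
```

A changed certificate is rejected rather than silently re-prompted, so replacing a pin would need an explicit user action in the settings. In Python the DER bytes could come from `ssl.SSLSocket.getpeercert(binary_form=True)`; the JVM equivalent would need its own research.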

Labels: enhancement (New feature or request)